Abstract


Numerous  problems  in  numerical  analysis,  including  matrix  inversion,  eigenvalue  calculations 
and  polynomial  zero  finding,  share  the  following  property:  the  difficulty  of  solving  a  given 
problem  is  large  when  the  distance  from  that  problem  to  the  nearest  "ill-posed"  one  is  small. 
For example, the closer a matrix is to the set of noninvertible matrices, the larger its condition
number with respect to inversion. We show that the sets of ill-posed problems for matrix
inversion,  eigenproblems,  and  polynomial  zero  finding  all  have  a  common  algebraic  and 
geometric  structure  which  lets  us  compute  the  probability  distribution  of  the  distance  from  a 
"random"  problem  to  the  set.  From  this  probability  distribution  we  derive,  for  example,  the 
distribution  of  the  condition  number  of  a  random  matrix.  We  examine  the  relevance  of  this 
theory  to  the  analysis  and  construction  of  numerical  algorithms  destined  to  be  run  in  finite 
precision  arithmetic. 

AMS  classifications:  15A12,  53C65,  60D05 


1.  Introduction 

To  investigate  the  probability  that  a  numerical  analysis  problem  is  difficult,  we  need  to 
do  three  things: 

1)  Choose  a  measure  of  difficulty, 

2)  Choose  a  probability  distribution  on  the  set  of  problems, 

3)  Compute  the  distribution  of  the  measure  of  difficulty  induced  by  the  distribution  on  the 
set  of  problems. 

The measure of difficulty we shall use in this paper is the condition number, which measures
the sensitivity of the solution to small changes in the problem. For the problems we consider
in this paper (matrix inversion, polynomial zero finding and eigenvalue calculation),
there are well known condition numbers in the literature, of which we shall use slightly modified
versions to be discussed more fully later. The condition number is an appropriate measure
of difficulty because it can be used to measure the expected loss of accuracy in the computed
solution, or even the number of iterations required for an iterative algorithm to converge
to a solution.

The probability distribution on the set of problems for which we will obtain most of our
results will be the "uniform distribution", which we define as follows. We will identify each
problem with a point in either $\mathbb{R}^N$ (if it is real) or $\mathbb{C}^N$ (if it is complex). For example, a real n
by n matrix A will be considered to be a point in $\mathbb{R}^{n^2}$, where each entry of A forms a coordinate
in $\mathbb{R}^{n^2}$ in the natural way. Similarly, a complex nth degree polynomial can be identified
with a point in $\mathbb{C}^{n+1}$ by using its coefficients as coordinates. On the space $\mathbb{R}^N$ (or $\mathbb{C}^N$) we
will take any spherically symmetric distribution, i.e. the induced distribution of the normalized
problem $x/\|x\|$ ($\|\cdot\|$ is the Euclidean norm) must be uniform on the unit sphere in $\mathbb{R}^N$.
For example, we could take a uniform distribution on the interior of the unit ball in $\mathbb{R}^N$, or
let each component be an independent Gaussian random variable with mean 0 and standard
deviation 1. Our answers will hold for this entire class of distributions because our condition
numbers are homogeneous (multiplying a problem by a nonzero scalar does not change its
condition number).
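
As an illustration of the Gaussian construction just described, the following minimal Python sketch draws a point uniformly from the unit sphere in $\mathbb{R}^N$ by normalizing a vector of independent standard Gaussians. The function name `random_unit_vector` and the choice N = 9 (a real 3 by 3 matrix) are assumptions made for the example only; they do not come from the paper.

```python
import math
import random

def random_unit_vector(n, rng):
    """Return a point uniformly distributed on the unit sphere in R^n.

    Independent N(0, 1) components give a spherically symmetric vector,
    so its normalization x / ||x|| is uniform on the sphere.
    """
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:  # guard against the (probability zero) zero vector
            return [v / norm for v in g]

rng = random.Random(0)           # seeded only to make the run repeatable
x = random_unit_vector(9, rng)   # hypothetical example: N = 9
x_norm = math.sqrt(sum(v * v for v in x))
print(x_norm)
```

Because the condition numbers considered here are homogeneous, evaluating them at such a normalized sample is equivalent to evaluating them at the unnormalized Gaussian sample.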

The  main  justification  for  using  a  uniform  distribution  is  that  it  appears  to  be  fair:  each 
problem  is  as  likely  as  any  other.  However,  it  does  not  appear  to  apply  in  many  practical 
cases  for  a  variety  of  reasons,  including  the  fact  that  any  set  of  problems  which  can  be 
represented  in  a  computer  is  necessarily  discrete  rather  than  continuous.  We  will  discuss  the 
validity  of  our  choice  of  uniform  distribution  as  well  as  alternatives  at  length  in  section  6 
below. 

Finally,  given  this  distribution,  we  must  compute  the  induced  probability  distribution  of 
the  condition  number.  It  turns  out  that  all  the  problems  we  consider  here  have  a  common 
geometric  structure  which  lets  us  compute  the  distributions  of  their  condition  numbers  with  a 
single  analysis,  which  goes  as  follows: 

(i)  Certain  problems  of  each  kind  are  ill-posed,  i.e.  their  condition  number  is  infinite. 
These  ill-posed  problems  form  an  algebraic  variety  within  the  space  of  all  problems.  For 
example,  the  singular  matrices  are  ill-posed  with  respect  to  the  problem  of  inversion, 
and  they  lie  on  the  variety  where  the  determinant,  a  polynomial  in  the  matrix  entries,  is 
zero.  Geometrically,  varieties  are  possibly  self-intersecting  surfaces  in  the  space  of 
problems. 

(ii) The condition number of a problem has a simple geometric interpretation: it is proportional
to (or bounded by a multiple of) the reciprocal of the distance to the set of ill-posed
problems. Thus, as a problem gets closer to the set of ill-posed ones, its condition
number approaches infinity. In the case of matrix inversion, for example, the traditional
condition number is exactly inversely proportional to the distance to the nearest singular
matrix.

(iii) The last observation implies that the set of problems of condition number at least x is
(approximately) the set of problems within distance c/x (c a constant) of the variety of
ill-posed problems. Sets of this sort, sometimes called tubular neighborhoods, have been studied
extensively by geometers. We will present upper bounds, lower bounds, and
asymptotic values for the volumes of such sets. The asymptotic results, lower bounds,
and some of the upper bounds are new. The formulae are very simple, depending only
on x, the dimension N of the ambient space, the dimension of the variety, and the degree of
the variety. These volume bounds in turn bound the volume of the set of problems with
condition number at least x. Since we are assuming the problems are uniformly distributed,
volume is proportional to probability.

Thus, for example, we will prove that a scaled version $\kappa(A) = \|A\|_F \cdot \|A^{-1}\|$ of the usual
condition number of a complex matrix with respect to inversion satisfies


$$ \frac{(1 - 1/x)^{2n^2-2}}{2 n^4 x^2} \le \mathrm{Prob}(\kappa(A) \ge x) \le \frac{e^2 n^5 (1 + n^2/x)^{2n^2-2}}{x^2} $$

and  that  asymptotically 

$$ \mathrm{Prob}(\kappa(A) \ge x) = \frac{n(n^2-1)}{x^2} + o\!\left(\frac{1}{x^2}\right) . $$


In other words, the probability that the condition number exceeds x decreases as the square
of the reciprocal of x. Even for moderate x the upper bound exceeds the asymptotic limit by
a ratio of only about $e^2 n^2$. If A is real we will show

$$ \frac{C (1 - 1/x)^{n^2-1}}{x} \le \mathrm{Prob}(\kappa(A) \ge x) \le 2 \sum_{k=1}^{n^2} \binom{n^2}{k} \left(\frac{2n}{x}\right)^k $$

where C is a constant proportional to the $(n^2-1)$-dimensional volume of the set of singular
matrices inside the unit ball. Thus, for real matrices the probability that the condition number
exceeds x decreases as $x^{-1}$.

There  are  a  number  of  open  questions  and  conjectures  concerning  these  volume  bounds, 
in  particular  for  how  general  a  class  of  real  varieties  they  apply  (the  case  of  complex  varieties 
is  simpler).  We  will  discuss  the  history  of  this  work  and  open  problems  in  detail  in  section  4 
below. 

It turns out that the reciprocal relationship between condition number and distance to
the nearest ill-posed problem holds for a much wider class of problem than just matrix inversion,
polynomial zero finding and eigenvalue calculations: it is shared, at least asymptotically,
by any problem whose solution is an algebraic function. For simplicity we shall restrict ourselves
to the three aforementioned problems, but our results do apply more widely, as discussed
in section 3 below and in [4].

This work was inspired by earlier work in a number of fields. Demmel [4], Gastinel [7],
Hough [14], Kahan [15], Ruhe [24], Stewart [28], Wilkinson [35, 36, 37] and others have
analyzed the relationship between the condition number and the distance to the nearest ill-posed
problem mentioned above in (ii). Gray [8, 9], Griffiths [11], Hotelling [13], Lelong
[20], Ocneanu [21], Renegar [23], Santalo [25], Smale [26], and Weyl [32] have worked on
bounds of volumes of tubular neighborhoods. These volume bounds have been used by Smale
[26, 27], Renegar [23] and others to analyze the efficiency of Newton's method for finding
zeros of polynomials. This latter work inspired the author [3] to apply these bounds to conditioning.
Ocneanu [22] and Kostlan [19] have also analyzed the statistical properties of the
condition number for matrix inversion.

The rest of this paper is organized as follows. Section 2 defines notation. Section 3
discusses the relationship between conditioning and the distance to the nearest ill-posed problem.
Section 4 presents the bounds on the volumes of tubular neighborhoods we shall use,
and states some related open problems. Section 5 computes the distributions of the condition
numbers of our three problems. Section 6 discusses the limitations of assuming a uniform distribution
and suggests alternatives and open problems. Section 7 contains the proofs of the
theorems in section 4.


2.  Notation 

We introduce several ideas we will need from numerical analysis, algebra, and
geometry. $\|x\|$ will denote the Euclidean norm of the vector $x$ as well as the induced matrix
norm

$$ \|A\| \equiv \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|} . $$

$\|A\|_F$ will denote the Frobenius norm

$$ \|A\|_F \equiv \Big( \sum_{ij} |A_{ij}|^2 \Big)^{1/2} . $$

If P is a set and x is a point, we will let dist(x,P) denote the Euclidean distance from x to the
nearest point in P.

A subset M of $\mathbb{R}^N$ is called an n-dimensional manifold if it is locally homeomorphic to
$\mathbb{R}^n$. We also write n = dim(M). The codimension of M, written codim(M), is N - n. In this
paper dimension will always refer to the real dimension rather than the complex dimension,
which is half the real dimension.

A variety is the set of solutions of a system of polynomial equations. A variety is homogeneous
if it is cone-shaped, i.e. if x is in the variety so is every scalar multiple $\alpha x$. A variety
is not generally a manifold since it can have singularities in the neighborhood of which it is
not homeomorphic to Euclidean space. However, points q with relatively open neighborhoods
$U_q \subset P$ that are manifolds are dense in P [17, Theorem 4.2.4] so that the following
definition makes sense: the dimension of P at p, written $\dim_p(P)$, is

$$ \dim_p(P) \equiv \limsup_{\substack{q \to p,\ q \in U_q \subset P \\ U_q \text{ a manifold}}} \dim(U_q) . $$

We in turn define the dimension of the variety P as the maximum over all $p \in P$ of $\dim_p(P)$. If
$\dim_p(P)$ is constant for all p, we call P pure dimensional. A complex variety defined by a
single nonconstant polynomial is called a complex hypersurface. Complex hypersurfaces are
pure dimensional with codimension 2. A real hypersurface has codimension 1 everywhere.
The real variety defined by the polynomials $f_1, \ldots, f_p$ is called a complete intersection if it is
pure dimensional of codimension p.

Now we define the degree of a purely n-dimensional variety P in $\mathbb{R}^N$. Let $L^{N-n}$ be an
(N - n)-dimensional linear manifold (plane) in $\mathbb{R}^N$. Since $\dim(L^{N-n}) + \dim(P) = \dim(\mathbb{R}^N) = N$
we say that $L^{N-n}$ and P are of complementary dimension. Generically, $L^{N-n}$ and P will
intersect in a surface of codimension equal to the sum of their codimensions, that is N. In
other words, their intersection will be of dimension 0 (a finite collection of points). If P is a
complex homogeneous variety, then for almost all planes $L^{N-n}$ this collection will contain the
same number of points, and this common number is called the degree of P, and is written
deg(P) (see [17, Theorem 4.6.2]). Intuitively, deg(P) gives the number of "leaves" of the
variety P that a typical plane $L^{N-n}$ will intersect. In the case of a nonhomogeneous or real
variety P, deg(P) is defined analogously as the maximum (finite) intersection number of a
plane $L^{N-n}$ and the n-dimensional variety P in $\mathbb{R}^N$, although the intersection number will not
generally be constant for almost all $L^{N-n}$.

This  concept  of  degree  is  a  generalization  of  the  degree  of  a  polynomial.  Indeed,  if  P  is 
complex  and  defined  as  the  solution  set  of  a  single  irreducible  polynomial,  then  the  degree  of 
the  polynomial  equals  the  degree  of  P  as  defined  above  [17]. 

By l-volume of an n-dimensional manifold M ($l \ge n$) we mean the l-dimensional Lebesgue
measure of M, if it exists. Note that if l > n this volume is zero. The notations vol(M) and
$\mathrm{vol}_n(M)$ denote the n-volume of the n-dimensional manifold M.


3.  Condition  Numbers  and  the  Distance  to  the  Nearest  Ill-Posed  Problem 

We  claim  that  many  classes  of  numerical  analysis  problems  permit  the  following 
geometric  characterization  of  their  condition  numbers: 

(i)  Certain  problems  of  each  class  are  ill-posed,  i.e.  their  condition  numbers  are  infinite. 
These  problems  form  a  variety  within  the  space  of  all  problems. 

(ii) The condition number of a problem has a simple geometric interpretation: it is proportional
to (or bounded by a multiple of) the reciprocal of the distance to the set of ill-posed
problems. Thus, as a problem gets closer to the set of ill-posed ones, its condition
number approaches infinity.

In this section we will cite results from the literature to prove these claims for the following
three classes of problems: matrix inversion, polynomial zero finding, and eigenvalue
calculation. Afterwards we will outline why this characterization applies to many other problems
as well [4].

First we need to define condition number more precisely. If X is our space of problems
equipped with norm $\|\cdot\|_X$, Y our space of solutions equipped with norm $\|\cdot\|_Y$, and $f: X \to Y$ is the
solution map for our problem, the usual definition of the relative condition number is

$$ \kappa_{rel}(f,x) \equiv \limsup_{\delta x \to 0} \frac{\|f(x+\delta x) - f(x)\|_Y / \|f(x)\|_Y}{\|\delta x\|_X / \|x\|_X} = \frac{\|Df(x)\|_{XY} \cdot \|x\|_X}{\|f(x)\|_Y} $$    (3.1)

if the Jacobian $Df$ exists ($\|\cdot\|_{XY}$ is the induced norm). In many cases the essential information
about the conditioning is contained in the $\|Df\|_{XY}$ factor. We may therefore use a multiple of
$\|Df\|_{XY}$ instead of $\kappa_{rel}$ without losing essential information.

All three of our problems are homogeneous: multiplying the problem by a scalar does
not change the condition number. Therefore, the set of ill-posed problems will also be
homogeneous, or cone-shaped. This permits us to normalize all our problems to have unit
norm (lie on the unit sphere in either $\mathbb{R}^N$ or $\mathbb{C}^N$), and implies that any results on the distribution
of the condition number will hold for any distribution of problems inducing the same distribution
of $x/\|x\|$ on the unit sphere.

Matrix Inversion: The usual relative condition number as defined in (3.1) with the $\|\cdot\|$ norm
on both the problem and solution spaces is [10]:

$$ \kappa_{rel}(A) = \|A\| \cdot \|A^{-1}\| . $$

We shall use the nearly equivalent condition number

$$ \kappa(A) \equiv \|A\|_F \cdot \|A^{-1}\| . $$

These  condition  numbers  are  both  homogeneous,  and  infinite  when  A  is  singular,  so  the  set 
of  ill-posed  problems  is  a  variety  defined  by  the  single  n-th  degree  homogeneous  irreducible 
polynomial  det(A)  =  0,  where  n  =  dim(A)  [31].  Denote  the  set  of  ill-posed  problems  by  IP. 
From  the  last  section,  we  see  that  if  A  is  complex,  IP  is  a  complex  hypersurface.  If  A  is  real, 
it  is  easy  to  verify  that  IP  is  still  a  real  hypersurface  by  using  the  explicit  parameterization 
provided  by  Gaussian  elimination  [3]. 

A theorem of Eckart and Young [5] gives the distance from a nonsingular matrix to IP:
Theorem 3.1: [5] $\mathrm{dist}(A, IP) = \|A^{-1}\|^{-1}$.
(Gastinel [7] proved this result for an arbitrary operator norm.)

Therefore, we see that in terms of $\kappa$ we may write

if $\|A\|_F = 1$, then $\mathrm{dist}(A, IP) = 1/\kappa(A)$ ,    (3.2)

i.e. that the distance from a normalized problem A to the nearest ill-posed problem is the
reciprocal of its condition number.
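
The relation (3.2) can be checked numerically. The sketch below is restricted to real 2 by 2 matrices, so that the smallest singular value (which equals $\|A^{-1}\|^{-1}$) has a closed form and no linear algebra library is needed. The function names and the sample matrix are illustrative assumptions, not part of the paper.

```python
import math

def frobenius_norm(a):
    """Frobenius norm of a matrix given as nested lists."""
    return math.sqrt(sum(v * v for row in a for v in row))

def smallest_singular_value_2x2(a):
    """Smallest singular value of a real 2-by-2 matrix, i.e. 1 / ||A^{-1}||."""
    t = sum(v * v for row in a for v in row)      # sigma_max^2 + sigma_min^2
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]   # +/- sigma_max * sigma_min
    disc = math.sqrt(max(t * t - 4.0 * det * det, 0.0))
    sigma_max = math.sqrt((t + disc) / 2.0)
    return abs(det) / sigma_max if sigma_max > 0.0 else 0.0

def kappa(a):
    """kappa(A) = ||A||_F * ||A^{-1}||; infinite when A is singular."""
    s = smallest_singular_value_2x2(a)
    return math.inf if s == 0.0 else frobenius_norm(a) / s

a = [[1.0, 2.0], [3.0, 4.0]]                      # hypothetical sample matrix
nf = frobenius_norm(a)
a_unit = [[v / nf for v in row] for row in a]     # normalize so ||A||_F = 1
dist_to_singular = smallest_singular_value_2x2(a_unit)  # Theorem 3.1
print(dist_to_singular, 1.0 / kappa(a))           # the two values should agree
```

Since $\kappa$ is homogeneous, `kappa(a)` and `kappa(a_unit)` coincide up to rounding, which is why the unnormalized matrix can be used on the right-hand side.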

Polynomial Zero Finding: In this case we are interested in the sensitivity of the zeros of a
polynomial to small perturbations in the coefficients. If $p(x)$ is an nth degree polynomial, let
$\|p\|$ denote the Euclidean norm of the vector of its coefficients. If $p(z) = 0$ and $\delta p$ is a small
perturbation of $p$, it is easy to verify that to first order the perturbed polynomial $p + \delta p$ has a
zero at $z + \delta z$ where

$$ \delta z = \frac{-\delta p(z)}{p'(z)} , $$

implying that the relative condition number is

$$ \kappa_{rel}(p,z) = \frac{n(z) \cdot \|p\|}{|p'(z)|} , $$

where

$$ n(z) \equiv |z|^{-1} \Big( \sum_{i=0}^{n} |z^{2i}| \Big)^{1/2} . $$

Note that the condition number depends both on the polynomial p and the choice of zero z.
For simplicity we will use the similar condition number

$$ \kappa(p,z) \equiv \frac{\|p\|}{|p'(z)|} . $$
Both condition numbers are infinite when $p'(z) = 0$, i.e. when z is a multiple zero. Thus we
will take the set IP of ill-posed problems to be those polynomials with multiple zeros. A
necessary and sufficient condition for a polynomial to have a multiple zero is that its discriminant,
an irreducible homogeneous polynomial of degree $2n - 2$ in the coefficients of p [31],
be zero. If p is complex, this implies the set of polynomials with zero discriminant is a hypersurface.
If p is real, this set of polynomials is still a hypersurface, as may be verified using
the parameterization provided by the leading coefficient $p_n$ and the zeros. The discriminant
may also be zero if the two leading coefficients of p equal zero (corresponding to a double
zero at $\infty$), but this set is a subvariety of double the codimension of the hypersurface in
which it lies, and so forms a set of measure zero we may neglect.
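
To make the simplified condition number $\|p\| / |p'(z)|$ concrete, here is a small Python sketch that evaluates it for a polynomial given by its coefficient list (lowest degree first). The helper names and the two sample polynomials are assumptions chosen for illustration; the second polynomial has two nearby zeros, so it lies close to the set of polynomials with a multiple zero.

```python
import math

def poly_eval(coeffs, z):
    """Evaluate sum(coeffs[i] * z**i) by Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * z + c
    return result

def poly_derivative(coeffs):
    """Coefficient list of the derivative (lowest degree first)."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def kappa_poly(coeffs, z):
    """||p|| / |p'(z)| at a zero z of p; infinite at a multiple zero."""
    norm_p = math.sqrt(sum(abs(c) ** 2 for c in coeffs))
    dp = abs(poly_eval(poly_derivative(coeffs), z))
    return math.inf if dp == 0.0 else norm_p / dp

well_separated = [2.0, -3.0, 1.0]     # (x - 1)(x - 2)
nearly_double = [1.001, -2.001, 1.0]  # (x - 1)(x - 1.001)
k_far = kappa_poly(well_separated, 1.0)
k_near = kappa_poly(nearly_double, 1.0)
print(k_far, k_near)
```

The condition number at the zero z = 1 grows from a modest value for the well separated zeros to a value in the thousands when the second zero moves to 1.001, in line with the geometric picture described above.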

Now  we  need  to  estimate  the  distance  from  a  given  polynomial  to  one  with  a  multiple 
zero.  The  estimate  we  shall  use  is  due  to  Hough  [14]  and  says 

Theorem 3.2: [14, 4] The distance dist(p, IP) from the polynomial p of degree at least 2 to
one with a multiple zero is bounded by

$$ \mathrm{dist}(p, IP) \le \sqrt{2} \cdot |p'(z)| $$

where $p(z) = 0$.

In fact, this is quite a weak result obtained by estimating the smallest change in p needed
to make a double zero at z, which turns out to be a linear least squares problem. Thus we
may write

if $\|p\| = 1$, then $\mathrm{dist}(p, IP) \le \dfrac{\sqrt{2}}{\kappa(p,z)}$    (3.3)

i.e. that the distance from a normalized problem p to the nearest ill-posed problem is
bounded by a multiple of the reciprocal of its condition number.

To see how much (3.3) may overestimate dist(p, IP), we present a lower bound. Note
that by changing the argument of p from x to $\alpha x$ ($\alpha$ scalar) we may make the leading coefficient
$p_n$ larger than the other coefficients.

Theorem 3.3: [4] Assume that p is an n-th degree polynomial satisfying $\|p\| = 1$ and
$|p_i| < |p_n|/n$ for $i < n$. Then

1             .0235       x 
dist(p,/P)  :>      min      (-r-  ,      2     2, r)  (3.4) 

Thus we see that the distance to the nearest ill-posed problem is bounded below essentially
by a multiple of the reciprocal of the square of the condition number. This is a general
phenomenon among algebraic problems to which we return below.

Eigenvalue calculations: We will be interested both in the sensitivity of eigenvalues and eigenvectors.
More precisely, we will consider the sensitivity of the projection associated with an
eigenvalue [16]. If $T$ is a matrix with simple eigenvalue $\lambda$, right eigenvector $x$ and left eigenvector
$y$, the projection $P$ associated with $\lambda$ is the matrix $P = x y^T / y^T x$. The reduced resolvent
associated with $\lambda$ is the matrix

$$ S = \lim_{z \to \lambda} (I - P)(T - z)^{-1} . $$

If $T$ has n distinct eigenvalues $\lambda_i$ with projections $P_i$, one can write

$$ S = \sum_{\lambda_i \ne \lambda} (\lambda_i - \lambda)^{-1} P_i . $$

If $\delta T$ is a small perturbation of $T$, one can show that to first order $\lambda$ changes to $\lambda + \delta\lambda$ and $P$
changes to $P + \delta P$ [16] where

$$ \delta\lambda = \mathrm{tr}\, P\,\delta T \quad \text{and} \quad \delta P = -S\,\delta T\,P - P\,\delta T\,S . $$

It is easy to verify that $|\delta\lambda|$ can be as large as $\|P\| \cdot \|\delta T\|$ and $\|\delta P\|$ can be at least as large as
$\|S\| \cdot \|P\| \cdot \|\delta T\|$ (and no more than twice as large as this). Therefore we may take as condition
numbers

$$ \kappa(T,\lambda) \equiv \|P\| $$

and

$$ \kappa(T,P) \equiv \|P\| \cdot \|S\| \cdot \|T\|_F $$

both of which are homogeneous.
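
As a small worked illustration of the eigenvalue condition number $\|P\|$, the sketch below treats only real 2 by 2 matrices with a real eigenvalue supplied by the caller. It uses the fact that the rank-one projection $x y^T / y^T x$ has 2-norm $\|x\| \cdot \|y\| / |y^T x|$. The function names and test matrices are assumptions for this example and are not taken from the paper.

```python
import math

def _null_vector(m, lam):
    """A nonzero v with (m - lam*I) v = 0, for a 2-by-2 m with eigenvalue lam."""
    (a, b), (c, d) = m
    if b != 0.0:
        return (b, lam - a)
    if c != 0.0:
        return (lam - d, c)
    # Diagonal matrix: pick the coordinate axis belonging to lam.
    return (1.0, 0.0) if abs(lam - a) <= abs(lam - d) else (0.0, 1.0)

def kappa_eigenvalue(t, lam):
    """||P|| for the projection P = x y^T / (y^T x) belonging to eigenvalue lam."""
    x = _null_vector(t, lam)                          # right eigenvector
    t_transposed = [[t[0][0], t[1][0]], [t[0][1], t[1][1]]]
    y = _null_vector(t_transposed, lam)               # left eigenvector
    ytx = x[0] * y[0] + x[1] * y[1]
    if ytx == 0.0:
        return math.inf                               # defective (multiple) eigenvalue
    return math.hypot(*x) * math.hypot(*y) / abs(ytx)

k_symmetric = kappa_eigenvalue([[2.0, 1.0], [1.0, 2.0]], 3.0)
k_nonnormal = kappa_eigenvalue([[1.0, 3.0], [0.0, 2.0]], 1.0)
k_near_defective = kappa_eigenvalue([[1.0, 1.0], [0.0, 1.001]], 1.0)
print(k_symmetric, k_nonnormal, k_near_defective)
```

For the symmetric matrix the projection is orthogonal and the condition number is 1; for the last matrix, whose two eigenvalues differ by only 0.001, it is roughly 1000.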

Both condition numbers are infinite when $\lambda$ is a multiple eigenvalue. Thus we will take
the set IP of ill-posed problems to be those matrices with multiple eigenvalues. We may see
that IP is a variety as follows. Let $p(T,\lambda)$ be the characteristic polynomial of T. T will have
multiple eigenvalues if and only if p has multiple zeros, which happens if and only if the
discriminant of p, a homogeneous polynomial of degree $n^2 - n$ in the entries of T, is zero
(note that p is monic) [17, 31]. It is not hard to show that this polynomial is irreducible [3].
Thus we see that if T is complex, IP is a hypersurface. If T is real, IP is still a hypersurface
[1].

We now need to relate the above condition numbers of T to the distance from T to IP.
A slight restatement of a theorem due to Wilkinson states

Theorem 3.4: [35] $\mathrm{dist}(T, IP) \le \sqrt{2} \cdot \|T\|_F / \|P\|$.

Therefore, in terms of $\kappa$ we may write

if $\|T\|_F = 1$, then $\mathrm{dist}(T, IP) \le \dfrac{\sqrt{2}}{\kappa(T,\lambda)}$ .    (3.5)

Wilkinson's theorem provides a somewhat weak upper bound on dist(T, IP). The condition
number for P on the other hand provides a lower bound on dist(T, IP):

Theorem 3.5: [4] $\mathrm{dist}(T, IP) \ge \|T\|_F / (7 \cdot \kappa(T,P))$.
This result lets us write

if $\|T\|_F = 1$, then $\mathrm{dist}(T, IP) \ge \dfrac{1}{7 \cdot \kappa(T,P)}$ .    (3.6)

For somewhat stronger results and discussion, see [4].


The phenomenon described above for matrix inversion, polynomial zero finding and
eigenvalue calculation is actually quite common in numerical analysis. It turns out all the
above results can be derived from the same underlying principle, that the condition number $\kappa$
satisfies one or both of the following differential inequalities:

$$ m \cdot \kappa^2 \le \|D\kappa\| \le M \cdot \kappa^2 , $$

where $D\kappa$ is the gradient of $\kappa$. The lower bound on $\|D\kappa\|$ yields the upper bound on
dist(T, IP)

$$ \mathrm{dist}(T, IP) \le \frac{1}{m \cdot \kappa(T)} $$

and the upper bound on $\|D\kappa\|$ yields the lower bound

$$ \mathrm{dist}(T, IP) \ge \frac{1}{M \cdot \kappa(T)} . $$

This phenomenon also appears in pole placement in linear control theory, Newton's method,
and elsewhere [4]. In the case of algebraic functions, one can show that at least for asymptotically
large condition numbers, differential inequalities of the form

$$ m \cdot \kappa^2 \le \|D\kappa\| \le M \cdot \kappa^3 $$

hold, the new upper bound on $\|D\kappa\|$ yielding the lower bound on dist(T, IP)

$$ \mathrm{dist}(T, IP) \ge \frac{1}{2M \cdot \kappa^2(T)} , $$

which is the source of inequality (3.4) above.
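
The passage from a differential inequality to a distance bound can be sketched as follows. This is only an outline under assumptions not stated in the text: that $\kappa$ is differentiable along a unit-speed path $T(s)$, $0 \le s \le d$, from $T$ to a nearest ill-posed problem, with $d = \mathrm{dist}(T, IP)$, and that $\kappa(T(s)) \to \infty$ as $s \to d$. The symbols $T(s)$, $s$ and $d$ are introduced here for illustration.

```latex
% Along the path, |d\kappa/ds| \le \|D\kappa\| because the path has unit speed.
% Case \|D\kappa\| \le M \kappa^2:
\frac{d}{ds}\,\frac{1}{\kappa(T(s))} = -\frac{1}{\kappa^2}\,\frac{d\kappa}{ds} \ge -M
\quad\Longrightarrow\quad
0 - \frac{1}{\kappa(T)} \ge -M d
\quad\Longrightarrow\quad
d \ge \frac{1}{M\,\kappa(T)} .
% Case \|D\kappa\| \le M \kappa^3:
\frac{d}{ds}\,\frac{1}{\kappa^2(T(s))} = -\frac{2}{\kappa^3}\,\frac{d\kappa}{ds} \ge -2M
\quad\Longrightarrow\quad
0 - \frac{1}{\kappa^2(T)} \ge -2 M d
\quad\Longrightarrow\quad
d \ge \frac{1}{2M\,\kappa^2(T)} .
```

In both cases the middle step integrates the left-hand inequality from $s = 0$ to $s = d$ and uses $1/\kappa \to 0$ at the ill-posed endpoint.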

Note also that the sets of ill-posed problems are hypersurfaces in our three examples
above. Other kinds of varieties are possible as well. For example, polynomials with at most
m distinct zeros form a subvariety of the variety of polynomials with at least one multiple
zero, and have codimension 2(n - m) (if complex) or n - m (if real) [3]. Since j-tuple zeros
are more sensitive than (j - 1)-tuple zeros [34], there is a natural hierarchy of sets of ever
more ill-posed problems, each one forming a subvariety of the previous set. Similar comments
apply to eigenvalue calculations (j-tuple eigenvalues are more sensitive than (j - 1)-tuple
eigenvalues) and rank-deficient linear least-squares problems (problems with higher rank
deficiency are more sensitive than ones with lower rank deficiency). The results of the next
section apply to these situations as well.


4.  On  Volumes  of  Tubes 

In  this  section  we  state  our  main  volume  estimates.  Proofs  appear  in  section  7  below. 

First  we  consider  complex  varieties.  The  upper  bounds  in  the  following  theorem  are 
obtained  by  generalizing  an  argument  of  Renegar  [23],  who  obtained  the  upper  bound  for 
hypersurfaces: 

Theorem 4.1: Suppose M is a complex, purely 2d-dimensional variety in $\mathbb{C}^N$. Let $f(\varepsilon)$ be the
fraction of the volume of the unit ball in $\mathbb{C}^N$ which is within distance $\varepsilon \le 1$ of M. Then

/(€)<    ^T{N+    1/2) e2N2^N-l)2N-2d-2.d       (M).e2(N-tf).(l  +  Ne)2d       (41) 

J  ^  '        r(N-d+  V2)T(d+  1/2)  v         '  ev     ' 

If M is a hypersurface (d = N - 1), then this upper bound may be improved to

$$ f(\varepsilon) \le e^2 N^3 \cdot \deg(M) \cdot \varepsilon^2 \cdot (1 + N\varepsilon)^{2(N-1)} . $$    (4.2)

If  M  passes  through  the  origin,  it  is  also  true  that 


(1-e)"  ,,fV-^    T(N~d+  l'^rO/+  1/2)    <  /(e) 

x  v^7  t 


(4.3) 


deg(M)  V-iT  T(N+  1/2) 
If  M  is  a  hypersurface  passing  through  the  origin  this  lower  bound  may  be  improved  to 

</(e)    .  (4.4) 


(1_02»-2e2 


W  deg(M) 


Now we specialize to the case of M homogeneous. In this case the upper bound (4.1)
may be improved to

$$ f(\varepsilon) \le e^2 N^2 (N-1)^{2N-2d-2} \cdot \deg(M) \cdot \varepsilon^{2(N-d)} \cdot (1 + N\varepsilon)^{2d} . $$    (4.5)

The  lower  bound  (4.3)  may  be  improved  to 

Vtt   r(A^+   1/2)  v   '   ' 

If  M  is  also  a  hypersurface,  the  lower  bound  (4.4)  may  be  further  improved  to 


li-Ll —  ==/(e)    . 


(4.7) 


Finally, we have the following asymptotic expression for small $\varepsilon$:

$$ f(\varepsilon) = \binom{N}{d} \cdot \deg(M) \cdot \varepsilon^{2(N-d)} + o(\varepsilon^{2(N-d)}) . $$    (4.8)

Thus, we have upper bounds, lower bounds and asymptotic formulae all of which only
depend on $\varepsilon$, N, d and deg(M). All our expressions are proportional to $\varepsilon^{2(N-d)}$, and for
asymptotically small $\varepsilon$ differ only by factors depending on the parameters N, deg(M) and d.
All these results are new, except for the upper bound (4.2) for d = N - 1 [23].

These results can be used to give bounds for $\mathrm{Prob}(\mathrm{dist}(p,M) \le \varepsilon)$ when M is homogeneous
and p is uniformly distributed on the unit sphere in $\mathbb{C}^N$:

Theorem 4.2: Suppose M is a complex, homogeneous, purely 2d-dimensional variety in $\mathbb{C}^N$.
Let p be distributed uniformly on the unit sphere centered at the origin in $\mathbb{C}^N$. Then for
d < N - 1

$$ \mathrm{Prob}(\mathrm{dist}(p,M) \le \varepsilon) \le e^2 N^2 (N-1)^{2N-2d-2} \cdot \deg(M) \cdot \varepsilon^{2(N-d)} \cdot (1 + N\varepsilon)^{2d} , $$    (4.9)

,-,       \2d    2(N-d)    T(N-d+  1/2)  T(d+  1/2)   _  n     .  ,..  „,     ...         , 

(1-«)M  ^  2*^1^+1/2)  P"b(dKt(p,M)  *  e)  ,  (4.10) 

and for asymptotically small $\varepsilon$

$$ \mathrm{Prob}(\mathrm{dist}(p,M) \le \varepsilon) = \binom{N-1}{d-1} \cdot \deg(M) \cdot \varepsilon^{2(N-d)} + o(\varepsilon^{2(N-d)}) . $$    (4.11)

For hypersurfaces (d = N - 1)

,2 

(1_e)2A'-2  _L_  <  Prob(dist(p,M)  <  e)  <  e2Ar2-deg(A/)-e2-(l  +  ATe)2(A'-1)        (4.12) 

and for asymptotically small $\varepsilon$

$$ \mathrm{Prob}(\mathrm{dist}(p,M) \le \varepsilon) = (N-1) \deg(M)\, \varepsilon^2 + o(\varepsilon^2) . $$    (4.13)

It is estimates (4.12) and (4.13) we shall apply to condition numbers in the next section.

Now we turn to real varieties. The bounds are necessarily looser, since a d-dimensional
real variety can have an arbitrarily small volume; this is in strict contrast to complex
varieties, where we can bound the volume above and below just in terms of the degree. The
next theorem is due to Ocneanu:

Theorem 4.3: [21] Suppose M is a real, purely d-dimensional variety in $\mathbb{R}^N$. Suppose further
that M is the complete intersection of the polynomials $g_1, \ldots, g_{N-d}$. Let $D = \max_i \deg(g_i)$,
and $f(\varepsilon)$ be the fraction of the volume of the unit ball in $\mathbb{R}^N$ which lies within distance $\varepsilon$ of
M. Then


f(e)*2(N-d)     £       E)-(2D«)*    •  (4.14) 


It appears that Ocneanu's proof may be extendable to give an asymptotic formula
for $f(\varepsilon)$, which we state as a

Conjecture: Suppose M is as in Theorem 4.3. Then for asymptotically small $\varepsilon$

/(£)  =  vol(M)-eA'^ Niyi'1) +  o^N~d)  (4  15) 

KJ  K     '  (N-d)*dl2T{{N-d)l2)  K        ' 

where vol(M) is the d-dimensional volume of M.

Without any assumptions about complete intersection, we can compute a lower bound
for $f(\varepsilon)$:

Theorem 4.4: Suppose M is a real, purely d-dimensional variety in $\mathbb{R}^N$. Let vol(M[r]) be the
d-dimensional volume of the subset of M within distance r of the origin. Then

vol(A/[l-el)     y-„   d  Y{dl2)  T({d  +  \)I2)  T((N-d +  l)/2)         f(, 

deg(M)  27r^  +  1>'2r((^  +  l)/2)  /W  (4"16) 

If M is homogeneous, $\mathrm{vol}(M[1-\varepsilon])$ may be replaced by $(1-\varepsilon)^d\, \mathrm{vol}(M[1])$.

Note that the ratio between the conjectured asymptotic value in (4.15) and the lower
bound in (4.16) depends only on N, d and deg(M).

As before, we can translate the estimates in the last two theorems into estimates on
$\mathrm{Prob}(\mathrm{dist}(p,M) \le \varepsilon)$, where p is uniformly distributed on the unit sphere:

Theorem 4.5: Let M be a real, purely d-dimensional homogeneous variety in $\mathbb{R}^N$. Suppose p
is uniformly distributed on the unit sphere in $\mathbb{R}^N$. Then

vol(Mril)    n     €)d  g*-rf  d  Tjdll)  T((d  +  l)/2)  T((N-d  +  l)/2) 
deg(M)  4ATir<rf+1>/2  r((JV  +  l)/2) 

s  Prob(dist(p,AO  s  e)  (4.17) 

If, in addition, M is the complete intersection of N - d polynomials, each of degree at most D,
then


Prob(dist(>,M)  <  e)  s  2{N  -  d)     £       g)-(2De)*  (4.18) 


If the conjecture (4.15) is true, then this would yield the following estimate for asymptotically
small $\varepsilon$ when M is a complete intersection:

Prob(dist(p,M)  £  e)  =  vol(M)-€A/-d dJ^NI2^ +  o(eN~d)    .     (4  19) 

K       W     J  '  K     >  (N-d)Tidl2r((N-d)/2)  K         } 

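Summarizing these results for the case of a real, homogeneous hypersurface defined by a single polynomial, we have

$$ \frac{\mathrm{vol}(M[1])}{\deg(M)} \, (1-\epsilon)^{N-1} \, \epsilon \, \frac{\Gamma(N/2)}{2N\,\pi^{N/2}} \le \mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon) \le 2 \sum_{k=1}^{N} \binom{N}{k} (2\deg(M)\,\epsilon)^k \tag{4.20} $$

and, for asymptotically small $\epsilon$ (if the conjecture (4.15) is true):

$$ \mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon) = \mathrm{vol}(M) \cdot \epsilon \, \frac{(N-1)\,\Gamma(N/2)}{\pi^{N/2}} + o(\epsilon) . \tag{4.21} $$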

It  is  estimate  (4.20)  we  will  use  to  estimate  the  distribution  of  condition  numbers  of  real 
problems. 
We may explain these theorems intuitively as follows. If $M$ is a $d$-dimensional surface in $\mathbb{R}^N$, the dominating term in the expression for the volume of the set of points within distance $\epsilon$ of $M$ turns out to be [32]:

$$ (d\text{-dimensional volume of } M) \cdot ((N-d)\text{-dimensional volume of the unit ball in } \mathbb{R}^{N-d}) \cdot \epsilon^{N-d} \tag{4.22} $$

Suppose, for example, $M$ is a straight line of length $l$ in $\mathbb{R}^2$. Then $d = 1$, $N = 2$ and the estimate of (4.22) is $l \cdot 2 \cdot \epsilon$, the area of a rectangle of length $l$ and width $2\epsilon$ centered on $M$. It turns out that even if $M$ is curved, as long as its radius of curvature everywhere exceeds $\epsilon$, the area of the strip of width $2\epsilon$ centered on $M$ is exactly $2l\epsilon$. If $M$ is a straight line of length $l$ in $\mathbb{R}^3$, (4.22) gives the volume $l\pi\epsilon^2$ of the right circular cylinder of length $l$ and radius $\epsilon$ centered on $M$. If $M$ is curved, this formula is still asymptotically correct for small $\epsilon$. If $M$ is a square of side $l$ in $\mathbb{R}^3$, (4.22) correctly gives the volume $l^2 \cdot 2 \cdot \epsilon$ of the rectangular parallelepiped of thickness $2\epsilon$ centered on $M$. Again, bending $M$ does not change the asymptotic correctness of (4.22). In fact, if $M$ is a smooth compact manifold, for sufficiently small $\epsilon$ the volume of the set of points within distance $\epsilon$ of $M$ is a polynomial in $\epsilon$ with leading term given in (4.22) [32].
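The three flat examples above can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function names `unit_ball_volume` and `tube_leading_term` are illustrative choices, and the unit-ball volume uses the standard formula $\pi^{k/2}/\Gamma(k/2+1)$.

```python
import math

def unit_ball_volume(k: int) -> float:
    """Volume of the unit ball in R^k: pi^(k/2) / Gamma(k/2 + 1)."""
    return math.pi ** (k / 2) / math.gamma(k / 2 + 1)

def tube_leading_term(vol_m: float, d: int, n: int, eps: float) -> float:
    """Leading term (4.22): vol_d(M) * vol(unit ball in R^(n-d)) * eps^(n-d)."""
    return vol_m * unit_ball_volume(n - d) * eps ** (n - d)

length, eps = 3.0, 0.01
# Segment of length l in the plane: rectangle of area 2 * l * eps.
print(tube_leading_term(length, 1, 2, eps), 2 * length * eps)
# Segment of length l in space: cylinder of volume l * pi * eps^2.
print(tube_leading_term(length, 1, 3, eps), length * math.pi * eps ** 2)
# Square of side l in space: slab of volume l^2 * 2 * eps.
print(tube_leading_term(length ** 2, 2, 3, eps), length ** 2 * 2 * eps)
```

In each printed pair the two numbers should agree up to rounding, since for flat $M$ the leading term is the whole volume apart from end effects.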

It remains to estimate the $d$-dimensional volume of $M$ needed in (4.22). Here we make use of the fact that $M$ is a variety, for there are formulae from integral geometry for estimating the volume of a set $M$ in $\mathbb{R}^N$ in terms of the number of points in $M \cap L$, where $L$ is a plane of dimension $N-d$. For varieties, this number is bounded by $\deg(M)$. In fact, if $M$ is a complex homogeneous purely $2d$-dimensional variety in $\mathbb{C}^N$, the $2d$-volume of the part of $M$ inside the unit ball is exactly $\deg(M)\,\pi^d/d!$ [30]. No such statement can be made about real varieties, so formulae like (4.3) and (4.8) cannot hold for real varieties.

Open Problems: Ocneanu's proof of Theorem 4.3 depends on being able to express the real variety $M$ as a complete intersection. Not all varieties permit such a representation. For example, the 3 by 3 real matrices of rank at most 1 form a variety of codimension 4 but 9 polynomials (the determinants of all 2 by 2 submatrices) are needed for its definition. Is there a bound for real varieties that does not depend on the property of complete intersection? Also, Ocneanu's bound contains the factor $D^{N-d}$, which by Bezout's theorem [31] is a possibly pessimistic upper bound for $\deg(M)$. Is there a bound which depends only linearly on $\deg(M)$? More generally, is there an upper bound which depends linearly on $\mathrm{vol}(M[1])$?

All the asymptotic expressions above depend on the contribution to $f(\epsilon)$ from small neighborhoods of the singular set of $M$ going to zero. For complex varieties, the proof of the upper bounds yields this fact. Ocneanu's proof appears to yield it as well, leading us to make conjecture (4.15).

The lower bounds for $f(\epsilon)$ for homogeneous varieties in Theorem 4.1 are independent of $\deg(M)$ whereas the upper bounds are proportional to $\deg(M)$. By considering nearly overlapping hyperplanes through the origin, one can see that the lower bound cannot contain a factor of $\deg(M)$ and so this gap between upper and lower bounds must be present. Restricting to irreducible $M$ does not eliminate this gap (see section 7 for discussion). However, the factor $1/\deg(M)$ in the lower bounds for nonhomogeneous varieties seems unnecessary; can it be removed?


5.  Computing  the  Distributions  of  Condition  Numbers 

In  this  section  we  apply    our  geometrical  estimates  of  the  last  section  to  compute  the 
distributions  of  the  condition  numbers  discussed  in  section  3. 

Matrix  Inversion:  Applying  estimates  (4.12)  and  (4.13)  to  equation  (3.2)  yields  the  following 
theorem: 
Theorem 5.1: Let $A$ be a random complex $n$ by $n$ matrix distributed in such a way that $A/\|A\|_F$ is uniformly distributed on the unit sphere. Let $\kappa(A) = \|A\|_F \cdot \|A^{-1}\|$. Then

$$ \frac{(1 - 1/x)^{2n^2-2}}{2n^4 x^2} \le \mathrm{Prob}(\kappa(A) \ge x) \le \frac{e^2 n^5 (1 + n^2/x)^{2n^2-2}}{x^2} \tag{5.1} $$

and

$$ \mathrm{Prob}(\kappa(A) \ge x) = \frac{n(n^2-1)}{x^2} + o\!\left(\frac{1}{x^2}\right) . \tag{5.2} $$

Remark: The upper bound in (5.1) exceeds the asymptotic value in (5.2) by a factor of only about $e^2 n^4/(n^2-1)$ for sufficiently large $x$. However, even for $n = 10$, $x$ must exceed about 5300 for the upper bound to drop below 1. For $n = 100$, $x$ must exceed $2.2 \cdot 10^7$ for the upper bound to drop below 1.
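The thresholds quoted in this remark can be reproduced with a short calculation. The sketch below is only an illustration: it assumes the upper bound $e^2 n^5 (1+n^2/x)^{2n^2-2}/x^2$ from (5.1), works with its logarithm to avoid overflow for small $x$, and uses bisection on a geometric scale. The function names are hypothetical.

```python
import math

def log_upper_bound(n: int, x: float) -> float:
    """Natural log of the upper bound in (5.1): e^2 n^5 (1 + n^2/x)^(2n^2-2) / x^2."""
    return (2.0 + 5.0 * math.log(n)
            + (2 * n * n - 2) * math.log1p(n * n / x)
            - 2.0 * math.log(x))

def asymptotic_value(n: int, x: float) -> float:
    """Leading term of (5.2): n (n^2 - 1) / x^2."""
    return n * (n * n - 1) / x ** 2

def threshold(n: int, lo: float = 1.0, hi: float = 1e15) -> float:
    """Smallest x at which the upper bound drops below 1.

    The bound is decreasing in x, so its logarithm changes sign exactly once
    between lo and hi; we bisect on a geometric scale.
    """
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if log_upper_bound(n, mid) < 0.0:
            hi = mid
        else:
            lo = mid
    return hi

for n in (10, 100):
    print(n, threshold(n))
```

For large $x$ the ratio `math.exp(log_upper_bound(n, x)) / asymptotic_value(n, x)` should approach $e^2 n^4/(n^2-1)$, the factor mentioned in the remark.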

Applying  estimate  (4.20)  to  equation  (3.2)  yields 

Theorem 5.2: Let $A$ be a random real $n$ by $n$ matrix distributed in such a way that $A/\|A\|_F$ is uniformly distributed on the unit sphere. Let $\kappa(A) = \|A\|_F \cdot \|A^{-1}\|$. Then

$$ \frac{C\,(1 - 1/x)^{n^2-1}}{x} \le \mathrm{Prob}(\kappa(A) \ge x) \le 2 \sum_{k=1}^{n^2} \binom{n^2}{k} \left(\frac{2n}{x}\right)^k \tag{5.3} $$

where $C > 0$ is a constant proportional to the volume of the variety of singular matrices inside the unit ball.

Remark: When $n = 10$, $x$ must exceed 4900 for the upper bound in (5.3) to be less than 1. More generally, for large $n$, $x$ must exceed about $4.93 n^3$ for the upper bound to be less than 1. One can prove this by noting that the upper bound may also be written as $2[(1 + 2n/x)^{n^2} - 1]$.
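The closed form $2[(1+2n/x)^{n^2}-1]$ makes the threshold explicit: the bound equals 1 exactly when $(1+2n/x)^{n^2} = 3/2$. The following sketch, with hypothetical function names, checks the closed form against the binomial sum and evaluates the threshold.

```python
import math

def upper_bound_sum(n: int, x: float) -> float:
    """Upper bound in (5.3), term by term: 2 * sum_{k=1}^{n^2} C(n^2, k) (2n/x)^k."""
    t = 2.0 * n / x
    return 2.0 * sum(math.comb(n * n, k) * t ** k for k in range(1, n * n + 1))

def upper_bound_closed_form(n: int, x: float) -> float:
    """The same bound via the binomial theorem: 2 * [(1 + 2n/x)^(n^2) - 1]."""
    return 2.0 * math.expm1(n * n * math.log1p(2.0 * n / x))

def threshold_real(n: int) -> float:
    """The x at which the closed form equals 1, i.e. (1 + 2n/x)^(n^2) = 3/2."""
    return 2.0 * n / math.expm1(math.log(1.5) / (n * n))

print(upper_bound_sum(3, 50.0), upper_bound_closed_form(3, 50.0))
print(threshold_real(10))                       # compare with "4900" in the remark
print(threshold_real(1000) / 1000 ** 3, 2.0 / math.log(1.5))  # compare with "4.93"
```

For large $n$ the threshold behaves like $2n^3/\ln(3/2)$, which is where the constant 4.93 comes from.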

Other sets of interest are matrices of rank at most $r < n-1$. The volumes of these sets can also be estimated from above and below using Theorem 4.2, provided we can bound the degree of these varieties. Bezout's theorem [31] provides a possibly pessimistic upper bound on the degree.

Polynomial Zero Finding: Applying estimate (4.12) to inequality (3.3) yields the following theorem:

Theorem 5.3: Let $p$ be a random complex $n$-th degree polynomial distributed in such a way that $p/\|p\|_F$ is uniformly distributed on the unit sphere. Let $\kappa(p) = \max \|p\| / |p'(z)|$, where the maximum is over all zeros of $p$. Then

$$ \mathrm{Prob}(\kappa(p) \ge x) \le \frac{4 e^2 (n+1)^2 (n-1) \left(1 + \sqrt{2}\,(n+1)/x\right)^{2n}}{x^2} . \tag{5.4} $$

Applying  estimate  (4.20)  to  inequality  (3.3)  yields 

Theorem 5.4: Let $p$ be a random real $n$-th degree polynomial distributed in such a way that $p/\|p\|_F$ is uniformly distributed on the unit sphere. Let $\kappa(p)$ be as in Theorem 5.3. Then

$$ \mathrm{Prob}(\kappa(p) \ge x) \le 2 \sum_{k=1}^{n+1} \binom{n+1}{k} \left(\frac{2^{5/2}(n-1)}{x}\right)^k . \tag{5.5} $$

Eigenvalue calculations: Applying estimate (4.12) to inequality (3.5) yields

Theorem 5.5: Let $A$ be a random complex $n$ by $n$ matrix distributed in such a way that $A/\|A\|_F$ is uniformly distributed on the unit sphere. Let $\kappa_\lambda(A) = \max \|P_\lambda(A)\|$ where the max is over all eigenvalues $\lambda(A)$ of $A$ and $P_\lambda(A)$ is the projection associated with $\lambda(A)$. Then

Prob(KX(A)  2=  x)  <  5 " i (5.6) 

x 

Applying  estimate  (4.20)  to  inequality  (3.5)  yields 

Theorem 5.6: Let $A$ be a random real $n$ by $n$ matrix distributed in such a way that $A/\|A\|_F$ is uniformly distributed on the unit sphere. Let $\kappa_\lambda(A)$ be as in Theorem 5.5. Then

Prob(K,(A)  ^  x)  <  2  2    W\  (^fe^lgl)*    .  (5  7) 

Applying  estimate  (4.12)  to  inequality  (3.6)  yields 

Theorem 5.7: Let $A$ be a random complex $n$ by $n$ matrix distributed in such a way that $A/\|A\|_F$ is uniformly distributed on the unit sphere. Let $\kappa_P(A) = \max \|P_\lambda(A)\| \cdot \|S_\lambda(A)\| \cdot \|A\|_F$, where the max is over all eigenvalues $\lambda(A)$ of $A$, $P_\lambda(A)$ is the projection associated with $\lambda(A)$, and $S_\lambda(A)$ is the reduced resolvent associated with $\lambda(A)$. Then


(1"i/(?r2"2^Prob(KP(A)^) 


(5.8) 


One can also prove a lower bound on $\mathrm{Prob}(\kappa_P(A) \ge x)$ for real matrices of the form $C/x$, but $C$ is proportional to the volume of the variety of real matrices with multiple eigenvalues and lying inside the unit ball, and seems difficult to estimate.


6.  Practical  Applications  and  Limitations 

In  this  section  we  show  how  to  estimate  the  distribution  of  the  error  in  results  computed 
by  finite  precision  algorithms  for  the  problems  we  analyzed  above.  The  new  tool  required  is 
backwards  error  analysis  [33];  using  it  we  show  that  except  in  the  improbable  situation  that 
the  problem  to  be  solved  is  close  to  the  set  IP  of  ill-posed  problems,  a  backwards  stable  algo- 
rithm will  supply  an  accurate  answer.  We  analyze  Gaussian  elimination  this  way  in  section 
6.1. 

Such  an  analysis  assumes  problems  are  distributed  uniformly  as  discussed  in  section  1  of 
this  paper.  This  assumption  breaks  down  in  two  important  situations.  First,  some  algorithms 
produce  problems  which  tend  to  lie  very  close  to  the  set  IP  of  ill-posed  problems,  or  which  in 
fact  converge  to  IP.  For  example,  inverse  iteration  to  compute  eigenvalues  and  eigenvectors 
involves  solving  a  sequence  of  linear  equations  with  increasingly  ill-conditioned  coefficient 
matrices.  Another  example  is  the  numerical  solution  of  differential  equations;  the  resulting 
matrices  are  approximations  of  unbounded  operators  and  are  necessarily  close  to  singular. 

Second,  the  set  of  problems  representable  in  a  computer  (in  finite  precision  arithmetic) 
is  necessarily  finite  and  so  any  distribution  we  put  on  this  set  will  necessarily  be  discrete,  not 
continuous  as  assumed  in  our  previous  analysis.  As  long  as  the  discrete  points  are  dense 
enough  to  model  the  continuum  (this  depends  on  the  individual  problem),  the  continuous 
model  is  relevant.  It  will  turn  out,  however,  that  this  discreteness  ultimately  leads  to  qualita- 
tively different  behavior  of  algorithms  than  is  predicted  by  the  continuous  model.  We  discuss 
this  situation  further  in  section  6.2.  (This  limitation  does  not  invalidate  our  analysis  of  Gaus- 
sian elimination  in  finite  precision  arithmetic  in  section  6.1.) 

Finally, in section 6.3, we discuss how this theory might be extended to the finite precision case and what such an extension would tell us about the design both of numerical algorithms and computer arithmetic. In particular, we show how it would tell us how many finite precision problems we could solve as a function of the extra precision used in intermediate calculations. This information would be of use in algorithm and even computer hardware design. Accomplishing this extension is an open problem.

6.1  A  Paradigm  for  Analyzing  the  Accuracy  of  Finite  Precision  Algorithms 

The  paradigm  for  applying  the  probabilistic  model  to  the  analysis  of  algorithms  is  as  fol- 
lows: 

(1)  Within  the  space  of  problems,  identify  the  set  IP  of  ill-posed  ones,  and  show  that  the 
closer  a  problem  is  to  IP  the  more  sensitive  the  solution  is  to  small  changes  in  the  prob- 
lem. 

(2)  Show  that  the  algorithm  in  question  computes  an  accurate  solution  for  a  problem  close 
to  the  one  it  received  as  input  (this  is  known  as  "backwards  stability"  [33]).  Combined 
with  the  result  of  (1),  this  will  show  that  the  algorithm  will  compute  an  accurate  solu- 
tion to  a  problem  as  long  as  the  problem  is  far  enough  from  IP. 

(3)  Compute  the  probability  that  a  random  problem  is  close  to  IP.  Using  this  probability 
distribution  in  conjunction  with  the  result  of  (2)  we  can  compute  the  probability  of  the 
algorithm  computing  an  accurate  result. 

This  paradigm  is  best  explained  by  applying  it  to  matrix  inversion: 

(1) The set IP of matrices which are ill-posed with respect to inversion is the set of singular matrices. As discussed in section 3, the condition number

$$ \kappa(M) = \|M\|_F \cdot \|M^{-1}\| \tag{6.1} $$

measures how difficult the matrix $M$ is to invert, and when $\|M\|_F = 1$ it is the reciprocal of the distance to the nearest singular matrix.

(2) Gaussian elimination with pivoting is a standard algorithm for matrix inversion and is well known to be a backwards stable algorithm [33]. Backwards stability means that when applying Gaussian elimination to compute the solution of the system of linear equations $Mx = b$, one gets an answer $\hat{x}$ which satisfies $(M + \delta M)\hat{x} = b$, where $\delta M$ is small in norm compared to $M$. More precisely, let $X_i$ be the $i$-th column of the approximation to $M^{-1}$ computed using Gaussian elimination, where the arithmetic operations performed (addition, subtraction, multiplication, and division) are all rounded off to $b$ bits of precision. Then $X_i$ is the value of the $i$-th column of the inverse of a matrix $M + \delta M_i$, where $\delta M_i$ is small:

$$ \|\delta M_i\|_F \le f(n) \cdot 2^{-b} \cdot \|M\|_F \tag{6.2} $$

where $f(n)$ is a function only of $n$, the dimension of $M$ [33]. The magnitude of $f(n)$ depends on the pivoting strategy, and can be as large as $2^n$ if partial pivoting is used, although this is very rare in practice [33]. $f(n)$ is much smaller if either complete pivoting is used or if we substitute the QR algorithm for Gaussian elimination [10]. For our analysis, however, we are not interested in how big $f(n)$ is, only that inequality (6.2) holds. Inequality (6.2) can be used to bound the relative error in the computed solution [33]:

$$ \frac{\|X - M^{-1}\|_F}{\|M^{-1}\|_F} \le \frac{\sqrt{n}\,\kappa(M) \cdot f(n) \cdot 2^{-b}}{1 - \kappa(M) \cdot f(n) \cdot 2^{-b}} \tag{6.3} $$

In other words, as long as the bound (6.2) on $\|\delta M_i\|_F$ is not so large that $M + \delta M_i$ could be singular, i.e. as long as

$$ \mathrm{dist}(M, IP) > f(n) \cdot 2^{-b} \cdot \|M\|_F $$

or, substituting from (6.1),

$$ \kappa(M) < 2^b / f(n) , \tag{6.4} $$

then the relative error in the computed inverse $X$ is bounded, and the smaller $\kappa(M)$ is the more accurate is the solution.

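(3) Assuming $M$ is complex we can apply Theorem 5.1 (which gives the probability distribution of the condition number) to estimate the probability that a random matrix can be inverted accurately:

$$ \mathrm{Prob}\left( \frac{\|X - M^{-1}\|_F}{\|M^{-1}\|_F} \le \epsilon \right) \ge \mathrm{Prob}\left( \frac{\sqrt{n}\,\kappa(M) \cdot f(n) \cdot 2^{-b}}{1 - \kappa(M) \cdot f(n) \cdot 2^{-b}} \le \epsilon \right) \tag{6.5} $$

which, after some rearrangement (and assuming $\epsilon < 1$) equals

$$ \begin{aligned}
&= \mathrm{Prob}\left( \kappa(M) \le \frac{\epsilon}{f(n) \cdot (\sqrt{n} + \epsilon)\,2^{-b}} \right) \\
&\ge 1 - e^2 n^5 \left(1 + n^2 f(n) (\sqrt{n} + \epsilon) \cdot (2^{-b}/\epsilon)\right)^{2n^2-2} f^2(n)\,(\sqrt{n} + \epsilon)^2 \cdot (2^{-b}/\epsilon)^2 \\
&= 1 - g(n,\epsilon,b) \cdot (2^{-b}/\epsilon)^2 .
\end{aligned} $$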

The $g(n,\epsilon,b)$ factor depends only weakly on $\epsilon$ and $b$; the interesting factor is $(2^{-b}/\epsilon)^2$. This inequality implies that as we compute with higher and higher precision ($b$ increases), the probability of getting a computed answer with accuracy $\epsilon$ goes to 1 at least as fast as $1 - O(4^{-b})$. Note that the inequality only makes sense for $2^{-b}/\epsilon$ small, that is if the error $2^{-b}$ in the arithmetic is smaller than the error $\epsilon$ demanded of the answer. This restriction makes sense numerically, since we cannot expect more precision than we compute with. The restriction also implies that the finite precision numbers are sufficiently dense to approximate the continuum, since the radius $r$ of the neighborhood around IP, $r = f(n)(\sqrt{n} + \epsilon)2^{-b}/\epsilon$, is much larger than the distance between adjacent finite precision points $2^{-b}$. This situation is depicted in Figure 1 and discussed in the next section.
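To see the $1 - O(4^{-b})$ behavior concretely, one can evaluate the lower bound from step (3) for increasing $b$. This is a rough sketch under stated assumptions: the paper leaves $f(n)$ unspecified, so the argument `f_n` below is a placeholder supplied by the caller, and the function names are illustrative. The bound is computed through its logarithm and clamped, because for small $b$ it exceeds 1 and says nothing.

```python
import math

def log_failure_bound(n: int, eps: float, b: int, f_n: float) -> float:
    """Log of the Theorem 5.1 upper bound at x = eps / (f_n (sqrt(n) + eps) 2^-b)."""
    inv_x = f_n * (math.sqrt(n) + eps) * 2.0 ** (-b) / eps
    return (2.0 + 5.0 * math.log(n)
            + (2 * n * n - 2) * math.log1p(n * n * inv_x)
            + 2.0 * math.log(inv_x))

def success_lower_bound(n: int, eps: float, b: int, f_n: float) -> float:
    """Lower bound on Prob(relative error <= eps); 0.0 when the bound is vacuous."""
    return 1.0 - math.exp(min(log_failure_bound(n, eps, b, f_n), 0.0))

n, eps, f_n = 10, 1e-3, 1.0   # f_n = 1.0 is a placeholder, not a value from the paper
for b in (20, 30, 40, 41, 50):
    print(b, success_lower_bound(n, eps, b, f_n))
```

Once $2^{-b}/\epsilon$ is small, each additional bit should reduce the failure bound by a factor close to 4, which matches the $(2^{-b}/\epsilon)^2$ factor.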

We  may  use  the  same  kind  of  paradigm  as  discussed  so  far  to  analyze  the  speed  of  con- 
vergence of  an  algorithm  rather  than  its  accuracy.  In  this  case  the  paradigm  is 

(1')    Identify  the  ill-posed  problems  IP. 

(2')    Show  that  the  closer  a  problem  is  to  IP,  the  more  slowly  the  algorithm  converges. 

(3')    Compute  the  probability  that  a  random  problem  is  close  to  IP.    Combined  with  (2')  this 
yields  the  probability  distribution  of  the  speed  of  convergence. 

This  approach  has  been  used  by  Smale  [26]  and  Renegar  [23]  in  their  average  speed 
analyses  of  Newton's  method  for  finding  zeros  of  polynomials. 

6.2  Limitations  of  the  Probabilistic  Model 

In  this  section  we  discuss  limitations  to  the  applicability  of  our  model.  As  mentioned 
before,  the  model  does  not  apply  in  situations  where  the  problems  tend  to  be  clustered  about 
the  ill-posed  problems.  One  such  example  is  inverse  iteration  for  computing  the  eigenvalues 
and  eigenvectors  of  a  matrix: 

$$ x_{i+1} = (A - \lambda_i)^{-1} x_i $$

$$ \lambda_{i+1} = (A x_{i+1})^{j} / x_{i+1}^{j} \quad \text{where } |x_{i+1}^{j}| = \max_k |x_{i+1}^{k}| . $$


If $\lambda_i$ is a good approximation to the simple eigenvalue $\lambda$, and $x_i$ approximates the corresponding eigenvector $x$, then $\lambda_{i+1}$ and $x_{i+1}$ will be even better approximations to $\lambda$ and $x$. As $\lambda_i$ approaches $\lambda$, the matrices $A - \lambda_i$ become increasingly ill-conditioned. Thus, the set of matrices $\{A - \lambda_i\}$ being (conceptually) inverted (actually, one solves $(A - \lambda_i)x_{i+1} = x_i$ directly) converges to the set IP of ill-posed problems, and so is far from uniformly distributed. This invalidates the assumption of the model, even in exact arithmetic. In finite precision arithmetic, inverse iteration works very well, even though naive backwards error analysis as in section 6.1 might lead us to expect total loss of precision. This is because the rounding errors committed while solving $(A - \lambda_i)x_{i+1} = x_i$ provably conspire to produce an error lying almost certainly in the direction of the desired eigenvector [10].
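The growth of the condition numbers during inverse iteration is easy to observe on a small case. The sketch below is an illustration only: it treats a 2 by 2 matrix, solves each system by Cramer's rule, scales the iterate so that its largest component is 1, and reports $\|B\|_F \cdot \|B^{-1}\|_F$ for $B = A - \lambda_i$ (Frobenius norm for both factors, a simplification of the paper's $\kappa$). The function name and the test matrix are assumptions made for the example.

```python
import math

def inverse_iteration_2x2(a, lam, x, steps):
    """Inverse iteration with a shift updated from the largest component.

    a is ((a11, a12), (a21, a22)). Returns (lam, x, history), where history
    lists (shift, condition number of A - shift*I) for each solve performed.
    """
    history = []
    for _ in range(steps):
        b11, b12 = a[0][0] - lam, a[0][1]
        b21, b22 = a[1][0], a[1][1] - lam
        det = b11 * b22 - b12 * b21
        if det == 0.0:            # shift is numerically an exact eigenvalue
            break
        # For a 2 by 2 matrix B, ||B||_F * ||B^-1||_F = ||B||_F^2 / |det B|.
        cond = (b11 * b11 + b12 * b12 + b21 * b21 + b22 * b22) / abs(det)
        history.append((lam, cond))
        # Solve (A - lam I) y = x by Cramer's rule.
        y = ((b22 * x[0] - b12 * x[1]) / det, (b11 * x[1] - b21 * x[0]) / det)
        j = 0 if abs(y[0]) >= abs(y[1]) else 1
        x = (y[0] / y[j], y[1] / y[j])          # largest component scaled to 1
        ax = (a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1])
        lam = ax[j] / x[j]
    return lam, x, history

A = ((2.0, 1.0), (1.0, 3.0))                    # eigenvalues (5 +/- sqrt(5)) / 2
lam, x, history = inverse_iteration_2x2(A, 3.5, (1.0, 1.0), 4)
for shift, cond in history:
    print(shift, cond)
print(lam, (5.0 + math.sqrt(5.0)) / 2.0)
```

The printed condition numbers should grow at every step while the eigenvalue estimate improves, which is the behavior described above.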

The  second  way  in  which  the  model  breaks  down  depends  on  the  ultimate  discreteness 
of  the  finite  precision  numbers  which  can  be  represented  in  a  computer.  The  natural  version 
of  a  "uniform  distribution"  in  this  case  is  simply  counting  measure.  The  continuous  model  is  a 
good  approximation  to  counting  measure  only  as  long  as  the  finite  precision  numbers  are 
dense enough to resemble the continuum. In Figure 1, for example, the area of the set of points within distance $r$ of the curve IP is a good approximation to the number of dots (finite precision points) within distance $r$ of IP (scaled appropriately). This is true because the radius $r$ of the neighborhood of IP is large compared to the spacing $2^{-b}$ between dots. When $r < 2^{-b}$, on the other hand, as in Figure 2, the area of the set of points within distance $r$ of IP is not necessarily a good approximation of the number of dots within $r$ of IP. For example, if IP were a straight line passing exactly half way between two rows of dots, there would be no dots within distance $2^{-b-1}$ of IP. If on the other hand IP were a straight line running along a row of dots, there would be a constant nonzero number of dots within distance $\eta$ of IP for all $\eta < 2^{-b}$. Thus, when the radius of the neighborhood of IP gets smaller than the interdot distance $2^{-b}$, the model breaks down.

Specifically,  let  us  consider  matrix  inversion.  In  the  continuous  model  the  exactly  singu- 
lar matrices  form  a  set  of  measure  zero,  so  the  chance  of  a  random  problem  being  singular  is 
zero.  Also,  there  are  nonsingular  matrices  arbitrarily  close  to  the  set  of  singular  ones,  and  so 
of  unbounded  condition  number.  Consider  now  the  finite  (but  large)  set  of  matrices  which 
can  be  represented  in  a  computer  using  finite  precision  arithmetic.  Some  fraction  of  this  finite 
set  are  exactly  singular,  so  in  choosing  one  member  of  this  finite  set  at  random  (using  count- 
ing measure)  there  is  a  nonzero  probability  of  getting  an  exactly  singular  matrix.  Further- 
more, the  remaining  nonsingular  matrices  have  condition  numbers  bounded  by  some  finite 
value $K$. Thus, instead of $\mathrm{Prob}(\kappa(A) \ge x)$ decreasing monotonically to 0 as $x$ increases as in the continuous case, $\mathrm{Prob}(\kappa(A) \ge x)$ becomes constant and nonzero for $x > K$. This is clearly significantly different behavior. It does not, however, invalidate the analysis of Gaussian elimination in the last section, because we assumed $2^{-b} < r$, i.e. the situation in Figure 1.

In the next section we discuss what we could do if we could compute $\mathrm{Prob}(\kappa(A) \ge x)$ in the discrete case for all $x$, in particular for $x$ too large for the continuous approximation to apply.

6.3  How  to  Use  the  Discrete  Distribution  of  Points  Within  Distance  e  of  a  Variety

Before  proceeding,  we  need  to  say  what  probability  measure  we  are  going  to  put  on  the 
discrete  set  of  finite  precision  points.  The  last  section  showed  that  no  single  distribution  is 
good  for  all  applications,  but  a  uniform  distribution  remains  a  neutral  and  interesting  choice. 
So  far  we  have  been  implicitly  using  fixed  point  numbers,  in  which  case  assigning  equal  pro- 
bability to  each  point  (counting  measure)  gives  a  uniform  distribution.  For  floating  point 
numbers,  however,  this  is  no  longer  appropriate  since  the  floating  point  numbers  are  not 
evenly  distributed  on  the  number  line.  Since  floating  point  numbers  are  much  closer  together 
near  the  origin  than  far  away  from  it  (the  distance  between  adjacent  numbers  is  approxi- 
mately a  constant  times  the  number),  counting  measure  would  assign  much  more  probability 
to  equal  length  intervals  near  the  origin  than  far  away  from  it.  A  simple  way  to  adjust  for 
this  nonuniform  spacing  is  to  assign  to  each  point  M  a  probability  proportional  to  the  volume 
of  the  small  parallelepiped  of  points  which  round  to  M  (i.e.  the  parallelepiped  centered  at  M 
with sides equal in length to the distance between adjacent finite precision points). In the case of fixed point arithmetic, this just reproduces counting measure, whereas with floating point arithmetic points near 0 have smaller probability than larger points, and intervals of equal length have approximately equal probabilities. Actually, the question of the distribution of the digits of a floating point number has a large literature [18, 12, 2], but the discussion in this section does not depend strongly on the actual distribution of digits chosen.

Figure 1. An $r > 2^{-b}$ neighborhood of IP

This discussion does assume that the finite precision input is known exactly, i.e. that there is no error inherited from previous computations or from measurement errors. In general there will be such errors, and they will almost always be at least a few units in the last place of the input problem. In other words, there already is a ball of uncertainty around the input problem with a radius equal to a small multiple of the interpoint distance $2^{-b_0}$. Therefore, it may make no sense to use higher precision to accurately solve problems lying very close to IP when the inherited input error is so large that the true answer is inherently very uncertain. In such situations programmers sometimes shrug and settle for the backwards stability provided by the algorithm, even if the delivered solution is entirely wrong, because the act of solution has scarcely worsened the uncertainty inherited from the data, and the programmer declines to be held responsible for the uncertainty inherent in the data. Nevertheless, getting an accurate answer for as many inputs as possible is a worthwhile goal, so we will not concern ourselves with possible errors made in creating the input matrices.

We  claim  that  knowing  the  probability  distribution  of  the  distance  of  a  random  finite 
precision  problem  to  the  set  IP  of  ill-posed  problems  will  tell  us  how  many  finite  precision 
problems  we  can  solve  as  a  function  of  the  extra  precision  used  in  intermediate  calculations. 
As  mentioned  before,  programmers  often  resort  to  extra  precision  arithmetic  to  get  more 
accurate  solutions  to  problems  which  are  given  only  to  single  precision.  This  extra  precision 
has  a  cost  (in  speed  and  memory)  dependent  on  the  number  of  digits  carried,  so  program- 
mers usually  avoid  extra  precision  unless  persuaded  otherwise  by  bad  experiences,  an  error 
analysis,  or  paranoia.  Therefore  an  accurate  estimate  of  how  many  problems  can  be  solved  as 
a  function  of  the  extra  precision  used  would  not  only  help  programmers  decide  how  much  to 
use  but  possibly  influence  hardware  designers  when  they  decide  how  much  precision  to  make 
available  in  their  computer  systems. 

How does knowledge of this probability distribution tell us how much extra precision to use? The paradigm in section 6.1 tells us how. Consider matrix inversion. Formula (6.3) tells us that using fixed point arithmetic of accuracy $2^{-b}$ permits us to compute inverses of matrices to within accuracy $\epsilon$ as long as their condition numbers are less than $\epsilon/(f(n)(\sqrt{n} + \epsilon)2^{-b})$. Suppose we choose our problems at random from the set of matrices with $b_0$-bit entries, and let $\mathrm{Prob}_{b_0}(\kappa(M) \ge x)$ be the discrete distribution function of the condition number. Then

$$ N_{b_0}(b) = 1 - \mathrm{Prob}_{b_0}\left( \kappa(M) \ge \frac{\epsilon}{f(n)(\sqrt{n} + \epsilon)\,2^{-b}} \right) $$

bounds from below the fraction of $b_0$-bit matrices we can invert with accuracy $\epsilon$ as a function of the number of bits $b > b_0$ carried in the calculation. By examining $N_{b_0}(b)$ as a function of $b$, one can decide exactly how much improvement one gets for each additional bit of precision $b$. For example, we know from the previous discussion that there is a $\bar{b}$ such that when $b > \bar{b}$, $N_{b_0}(b)$ is constant and nonzero. Therefore, it clearly does not pay to increase $b$ beyond $\bar{b}$.

We close with an example of the discrete distribution $\mathrm{Prob}_{b_0}(\kappa(M) \ge x)$. Consider the rather simple problem of inverting real 2 by 2 matrices. This problem is small enough that we can exhaustively compute $\mathrm{Prob}_{b_0}(\kappa(M) \ge x)$ for low precision arithmetic. We have done this for $b_0 = 3$, 4, 5, 6 and 7 (all numbers lay between 0 and 1 in absolute value, and each fixed point matrix was assigned the same probability). Let $P(r) = \mathrm{Prob}_{b_0}(\kappa(M) \ge 1/r)$. We recall that in the continuous case (Theorem 5.2) $P(r)$ would be approximately a linear function of $r$. For all values of $b_0$ tested, we observed approximately the behavior of $P(r)$ shown in Figure 3. Surprisingly, we observed linear dependence of $P(r)$ on $r$ not only for $r$ larger than $2^{-b_0}$ (corresponding to Figure 1) but for $r$ quite a bit smaller than $2^{-b_0}$ (Figure 2). The fraction of problems within $2^{-b_0}$ of a singular matrix was about 2 [exponent illegible]. This linear behavior of $P(r)$ continued until $r$ reached approximately $2^{-2b_0}$, and there the graph of the distribution became horizontal and remained so all the way to the origin, intersecting the vertical axis at about $2^{2-2b_0}$. This means that all matrices closer to IP than approximately $2^{-2b_0}$ were exactly singular. The fraction of matrices which were exactly singular was about $2^{2-2b_0}$.
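An experiment of this kind can be sketched in a few lines. The code below is not the authors' program; it assumes that a $b_0$-bit fixed point entry is $k/2^{b_0}$ with integer $|k| \le 2^{b_0}-1$ (one possible reading of "between 0 and 1 in absolute value"), enumerates all such 2 by 2 matrices, and tabulates $1/\kappa(M) = \sigma_{\min}(M)/\|M\|_F$, which is 0 for singular matrices. Since $1/\kappa$ is scale invariant, the integers $k$ are used directly. All names are illustrative.

```python
import math
from itertools import product

def inverse_condition_2x2(a: int, b: int, c: int, d: int) -> float:
    """1/kappa(M) = sigma_min(M) / ||M||_F for M = [[a, b], [c, d]]; 0.0 if singular."""
    det = a * d - b * c
    if det == 0:
        return 0.0
    f2 = a * a + b * b + c * c + d * d              # squared Frobenius norm
    sigma_max = math.sqrt((f2 + math.sqrt(f2 * f2 - 4 * det * det)) / 2)
    return abs(det) / (sigma_max * math.sqrt(f2))   # sigma_min = |det| / sigma_max

def inverse_conditions(b0: int) -> list:
    """1/kappa for every 2 by 2 matrix with integer entries |k| <= 2^b0 - 1."""
    m = 2 ** b0 - 1
    values = range(-m, m + 1)
    return [inverse_condition_2x2(a, b, c, d) for a, b, c, d in product(values, repeat=4)]

def empirical_P(inv_conds: list, r: float) -> float:
    """Fraction of matrices with kappa(M) >= 1/r, i.e. with 1/kappa(M) <= r."""
    return sum(1 for t in inv_conds if t <= r) / len(inv_conds)

b0 = 3
inv_conds = inverse_conditions(b0)
print("fraction exactly singular:", empirical_P(inv_conds, 0.0))
for j in range(1, 2 * b0 + 1):
    r = 2.0 ** (-j)
    print(r, empirical_P(inv_conds, r))
```

The exact fractions depend on the assumed entry set, so the printed values are best compared with the description above only qualitatively.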
Figure 3. Observed probability distribution of the distance $r$ to the nearest singular matrix (vertical axis: $\mathrm{Prob}(\mathrm{dist}(M, IP) < r)$).

What does this tell us about the use of extra precision? Basically, as long as the distribution function $P(r)$ remains linear, it says that for every extra bit of intermediate precision, we can solve half the problems we couldn't solve before. This regime continues until we reach double precision, at which point the only problems we can't solve are exactly singular. Indeed, since

$$ \begin{pmatrix} a & c \\ b & d \end{pmatrix}^{-1} = (ad - bc)^{-1} \begin{pmatrix} d & -c \\ -b & a \end{pmatrix} $$

we can clearly compute the inverse accurately if we can compute the determinant $ad - bc$ accurately. Since $a$, $b$, $c$ and $d$ are given to single precision, double precision clearly suffices to compute $ad - bc$ exactly.
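The exactness claim can be checked directly. The sketch below assumes, purely for illustration, that "single precision" means 24-bit fixed point entries $k/2^{24}$ and that "double precision" is IEEE double arithmetic (53-bit significand), which is what Python floats provide. Each product then needs at most 48 bits and the difference at most 49, so no rounding occurs. The names `det_exact` and `det_double` are hypothetical.

```python
import random
from fractions import Fraction

BITS = 24  # assumed width of the fixed point inputs (illustrative choice)

def det_exact(ka: int, kb: int, kc: int, kd: int, bits: int = BITS) -> Fraction:
    """Exact value of ad - bc for a = ka / 2^bits, etc., as a rational number."""
    return Fraction(ka * kd - kb * kc, 2 ** (2 * bits))

def det_double(ka: int, kb: int, kc: int, kd: int, bits: int = BITS) -> float:
    """The same determinant evaluated in double precision floating point."""
    scale = 2.0 ** (-bits)
    a, b, c, d = (k * scale for k in (ka, kb, kc, kd))
    return a * d - b * c

rng = random.Random(0)
limit = 2 ** BITS - 1
mismatches = 0
for _ in range(1000):
    ks = [rng.randint(-limit, limit) for _ in range(4)]
    if Fraction(det_double(*ks)) != det_exact(*ks):
        mismatches += 1
print("mismatches:", mismatches)
```

Because the determinant is exact, a computed value of zero identifies an exactly singular input, and every other input has a nonzero determinant known to full accuracy.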

What if the discrete distribution function were similar for matrices of higher dimensions,
that is, linear for a while and then suddenly horizontal when all worse conditioned
matrices were exactly singular? It would again tell us that for a while, every extra bit of
intermediate precision would let us solve half the problems we couldn't solve before. Eventually,
after enough extra bits (and for inverting fixed precision n by n matrices, this clearly
occurs no later than reaching n-tuple precision), all finite precision matrices which are not
exactly singular could be inverted, and more precision would contribute nothing. Thus a programmer
(or hardware designer) could choose the number of bits b with which to compute in
order to guarantee that the fraction of unsolvable problems is sufficiently close to its
minimum. Of course, exhaustive evaluation of the distribution function is not reasonable for
large problems, and estimating the distribution function becomes an interesting open question
of number theory.
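The exhaustive experiment described above can be imitated on a very small scale. The following is a minimal sketch, not the computation behind Figure 3: it assumes (hypothetically) that the matrix entries are the integers -m, ..., m, measures the relative distance to the nearest singular matrix as the smallest singular value divided by the Frobenius norm, and uses the illustrative names `smallest_singular_value`, `singular_fraction` and `observed_distribution`.

```python
from fractions import Fraction
from itertools import product
import math


def smallest_singular_value(a, b, c, d):
    """Distance from [[a, b], [c, d]] to the nearest singular matrix.

    For a 2 by 2 matrix this distance is the smallest singular value,
    computed here as |det| / sigma_max to avoid cancellation.
    """
    frob2 = a * a + b * b + c * c + d * d
    det = a * d - b * c                      # exact for Python integers
    if det == 0:
        return 0.0
    disc = math.sqrt(max(frob2 * frob2 - 4 * det * det, 0))
    sigma_max = math.sqrt((frob2 + disc) / 2)
    return abs(det) / sigma_max


def singular_fraction(m):
    """Exact fraction of integer matrices with entries in [-m, m] that are singular."""
    entries = range(-m, m + 1)
    total = singular = 0
    for a, b, c, d in product(entries, repeat=4):
        total += 1
        singular += (a * d == b * c)
    return Fraction(singular, total)


def observed_distribution(m, r):
    """Fraction of matrices whose relative distance to singularity is at most r."""
    entries = range(-m, m + 1)
    hits = total = 0
    for a, b, c, d in product(entries, repeat=4):
        total += 1
        norm = math.sqrt(a * a + b * b + c * c + d * d)
        if norm == 0 or smallest_singular_value(a, b, c, d) <= r * norm:
            hits += 1
    return hits / total


if __name__ == "__main__":
    print("exactly singular:", singular_fraction(4))
    for i in range(1, 8):
        print(i, observed_distribution(4, 2.0 ** -i))
```

Because the determinant is formed in exact integer arithmetic, the test `a * d == b * c` plays the role of the "double precision suffices" remark above; the enumeration costs (2m+1)^4 steps, which is why it is only feasible for tiny problems.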


7.    Proofs  of  Volume  Estimates 

In this section we present the proofs of the volume estimates of section 4. In addition
to the notation of section 2, we will use

$$O_n = \frac{2\pi^{(n+1)/2}}{\Gamma((n+1)/2)}, \qquad \theta_n = \frac{2\pi^{n/2}}{n\,\Gamma(n/2)}.$$

$O_n$ is the surface area of the n-dimensional unit sphere and $\theta_n$ is the volume of the
n-dimensional unit ball [25]. B(p,r) is the open ball of radius r centered at p, and B(r) is
centered at the origin. If M is any set, $M[r] = M \cap B(r)$. If M is a variety, NS(M) will denote
the nonsingular part of M (a manifold) and S(M) the remaining singular part (a lower dimensional
subvariety). vol(M) or $\mathrm{vol}_{\dim(M)}(M)$ will denote the dim(M)-dimensional measure of
NS(M). $T(M,\epsilon)$ is the set of points inside B(1) within distance $\epsilon$ of M:

$$T(M,\epsilon) = \{z : \|z\| \le 1,\ \mathrm{dist}(z,M) \le \epsilon\}$$

If $M \subset \mathbb{R}^N$, $f(M,\epsilon) = \mathrm{vol}_N(T(M,\epsilon))/\theta_N$, the fraction of the unit ball within $\epsilon$ of M. #(S) will
denote the cardinality of the discrete set S.
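As a quick sanity check on the two constants, the sketch below evaluates them with the standard library gamma function. The function names `sphere_area` and `ball_volume` are illustrative choices, not notation from the paper.

```python
import math


def sphere_area(n):
    """O_n: surface area of the n-dimensional unit sphere (the sphere in R^(n+1))."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)


def ball_volume(n):
    """theta_n: volume of the n-dimensional unit ball, for n >= 1."""
    return 2 * math.pi ** (n / 2) / (n * math.gamma(n / 2))


if __name__ == "__main__":
    for n in range(1, 7):
        # theta_n = O_(n-1) / n relates the ball to its bounding sphere
        print(n, sphere_area(n), ball_volume(n), sphere_area(n - 1) / n)
```

The familiar values are recovered: $O_1 = 2\pi$ (circumference of the circle), $O_2 = 4\pi$, $\theta_2 = \pi$ and $\theta_3 = 4\pi/3$.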

We will need the following estimates on the volumes of varieties inside balls:

Lemma 7.1: [30]. Let M be a purely 2d-dimensional homogeneous complex variety in $\mathbb{C}^N$.
Then $\mathrm{vol}_{2d}(M[r]) = \deg(M)\cdot\theta_{2d}\cdot r^{2d}$.

Lemma 7.2: [29, Thm B]. Let M be a purely 2d-dimensional complex variety containing the
origin. Then $\mathrm{vol}_{2d}(M[r]) \ge \theta_{2d}\cdot r^{2d}$.

Lemma 7.3: [23, Prop. 5.3]. Let M be a complex hypersurface in $\mathbb{C}^N$. Then

$$\mathrm{vol}_{2N-2}(M[r]) \le \deg(M)\cdot N\cdot\theta_{2N-2}\cdot r^{2N-2}.$$

Lemma 7.4: Let M be a purely d-dimensional real variety in $\mathbb{R}^N$. Then

$$\mathrm{vol}_d(M[r]) \le \deg(M)\cdot\frac{O_{N-d}\,O_d}{2\,O_N}\cdot\theta_d\cdot r^d.$$

Proof: That $\mathrm{vol}_d(M[r])$ is finite follows from [6, sec. 3.4.10]. Let $L_{N-d}$ denote an
(N-d)-dimensional plane in $\mathbb{R}^N$. $dL_{N-d}$ will denote the kinematic density on this set of planes [25,
chap. 12]. From [25, eq. 12.38] we may write $dL_{N-d} = d\sigma_d \wedge dL_{d[0]}$, where $dL_{d[0]}$ is the
kinematic density on d-planes through the origin and $d\sigma_d$ is the volume element on $L_{d[0]}$.
This corresponds to parameterizing a plane $L_{N-d}$ by the perpendicular plane through the origin
$L_{d[0]}$ which it intersects, and where it intersects it. From [25, eq. 14.70] we may write

$$\int_{M[r]\cap L_{N-d}\ne\emptyset} \#(M[r]\cap L_{N-d})\,dL_{N-d} = \frac{O_N\cdots O_{d+1}}{O_{N-d}\cdots O_1}\,\mathrm{vol}_d(M[r])$$

or

$$\mathrm{vol}_d(M[r]) = \frac{O_{N-d}\cdots O_1}{O_N\cdots O_{d+1}}\int \#(M[r]\cap L_{N-d})\,dL_{N-d}
\le \frac{O_{N-d}\cdots O_1}{O_N\cdots O_{d+1}}\cdot\deg(M)\cdot\int_{\|x\|\le r} d\sigma_d\wedge dL_{d[0]}$$

where x is the intersection point of $L_{N-d}$ and $L_{d[0]}$. In other words, M[r] and $L_{N-d}$ can intersect
only if $L_{N-d}$ passes within distance r of the origin. The last displayed expression in turn
equals

$$\frac{O_{N-d}\cdots O_1}{O_N\cdots O_{d+1}}\cdot\deg(M)\cdot\int_{\|x\|\le r} d\sigma_d\int dL_{d[0]}
= \frac{O_{N-d}\cdots O_1}{O_N\cdots O_{d+1}}\cdot\deg(M)\cdot r^d\,\theta_d\cdot\frac{O_{N-1}\cdots O_{N-d}}{O_{d-1}\cdots O_0}
= \deg(M)\cdot\frac{O_{N-d}\,O_d}{2\,O_N}\cdot\theta_d\cdot r^d$$

where we have used [25, eq. 12.35] and, in the last step, $O_0 = 2$. □

By considering a purely 2d-dimensional complex variety in $\mathbb{C}^N$ as a real variety in $\mathbb{R}^{2N}$,
the last lemma implies

Corollary 7.5: Let M be a purely 2d-dimensional complex variety in $\mathbb{C}^N$. Then

$$\mathrm{vol}_{2d}(M[r]) \le \deg(M)\cdot\frac{O_{2N-2d}\,O_{2d}}{2\,O_{2N}}\cdot\theta_{2d}\cdot r^{2d}.$$

Remark: In the case of hypersurfaces (d = N-1), Corollary 7.5 yields the bound
$\deg(M)\cdot(2N-1)\cdot\theta_{2N-2}\cdot r^{2N-2}$, which is about a factor of 2 weaker than Lemma 7.3. This is
because the proof of Lemma 7.4 takes no advantage of the complex analytic structure of M.
One might be able to improve Corollary 7.5 by taking advantage of this structure.
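The factor (2N-1) quoted in the Remark can be confirmed numerically from the definition of $O_n$. The sketch below is only a check of that arithmetic; `sphere_area` and `real_bound_factor` are illustrative names.

```python
import math


def sphere_area(n):
    """O_n = 2 pi^((n+1)/2) / Gamma((n+1)/2)."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)


def real_bound_factor(N, d):
    """The constant O_(2N-2d) O_(2d) / (2 O_(2N)) appearing in Corollary 7.5."""
    return sphere_area(2 * N - 2 * d) * sphere_area(2 * d) / (2 * sphere_area(2 * N))


if __name__ == "__main__":
    for N in range(2, 9):
        factor = real_bound_factor(N, N - 1)      # hypersurface case d = N - 1
        # Lemma 7.3 has the constant N in the same position
        print(N, factor, factor / N)
```

For hypersurfaces the ratio to the constant N of Lemma 7.3 is (2N-1)/N, which approaches 2 as N grows, in line with the "about a factor of 2" statement.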

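Lemma 7.6: Let M be a purely 2d-dimensional complex variety in $\mathbb{C}^N$. Let s = N if M is a
hypersurface (d = N-1) and $s = O_{2N-2d}O_{2d}/(2O_{2N})$ otherwise. Then

$$\mathrm{vol}_{2N}(T(M,\epsilon)) \ge \frac{\mathrm{vol}_{2d}(M[1-\epsilon])}{\deg(M)\cdot s\cdot\theta_{2d}}\cdot\theta_{2N}\cdot\epsilon^{2(N-d)} \tag{7.1}$$

Proof: Following [25, chap. 15] we will let dK be the kinematic density for the group of
Euclidean motions in $\mathbb{C}^N = \mathbb{R}^{2N}$. dK may be written $dA \wedge db$, where dA is a density on the
special orthogonal group SO(2N), db is Lebesgue measure on $\mathbb{R}^{2N}$, and K(s) = As + b. From
[25, eq. 15.20] we have

$$\int_{M[1-\epsilon]\cap K(B(\epsilon))\ne\emptyset} \mathrm{vol}_{2d}(M[1-\epsilon]\cap K(B(\epsilon)))\,dK = O_{2N-1}\cdots O_1\cdot\mathrm{vol}_{2d}(M[1-\epsilon])\cdot\mathrm{vol}_{2N}(B(\epsilon)) \tag{7.2}$$

where $K(B(\epsilon))$ is the Euclidean motion K applied to $B(\epsilon)$. (Actually, Santalo only proved
this theorem, as well as others of his we shall use, for closed manifolds. The same results for
rectifiable surfaces (including varieties) are due to Federer [6, Thm. 3.2.48 and Sec. 3.4.10].
We will continue to use Santalo's formulation of the result because it is more convenient.)
Since $\mathrm{vol}_{2N}(B(\epsilon)) = \theta_{2N}\epsilon^{2N}$ and $\mathrm{vol}_{2d}(M[1-\epsilon]\cap K(B(\epsilon))) \le \deg(M)\cdot s\cdot\theta_{2d}\cdot\epsilon^{2d}$ by Lemma 7.3
and Corollary 7.5, we have

$$\deg(M)\cdot s\cdot\theta_{2d}\cdot\epsilon^{2d}\cdot\int_{M[1-\epsilon]\cap K(B(\epsilon))\ne\emptyset} dK \ge O_{2N-1}\cdots O_1\cdot\mathrm{vol}_{2d}(M[1-\epsilon])\cdot\theta_{2N}\cdot\epsilon^{2N} \tag{7.3}$$

Since $B(\epsilon)$ is a ball, $K(B(\epsilon)) = B(b,\epsilon)$ so

$$\int_{M[1-\epsilon]\cap K(B(\epsilon))\ne\emptyset} dK = \int dA\wedge db = \int_{M[1-\epsilon]\cap B(b,\epsilon)\ne\emptyset} db\int dA = O_{2N-1}\cdots O_1\cdot\int_{M[1-\epsilon]\cap B(b,\epsilon)\ne\emptyset} db$$

where we have used [25, eq. 12.11]. Now $\int_{M[1-\epsilon]\cap B(b,\epsilon)\ne\emptyset} db$ is the volume of the set of
points within distance $\epsilon$ of $M[1-\epsilon]$, which is included in $T(M,\epsilon)$. Therefore

$$\int_{M[1-\epsilon]\cap B(b,\epsilon)\ne\emptyset} db \le \mathrm{vol}_{2N}(T(M,\epsilon)) \tag{7.4}$$

Combining (7.2), (7.3) and (7.4) yields the result. □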

Theorem 7.7: Let M be a purely 2d-dimensional complex variety in $\mathbb{C}^N$ containing the origin.
If d < N-1 then

$$(1-\epsilon)^{2d}\,\epsilon^{2(N-d)}\cdot\frac{2\,O_{2N}}{\deg(M)\cdot O_{2N-2d}\,O_{2d}} \le f(M,\epsilon) \tag{7.5}$$

If d = N-1 (M a hypersurface) then

$$\frac{(1-\epsilon)^{2N-2}\,\epsilon^2}{N\cdot\deg(M)} \le f(M,\epsilon) \tag{7.6}$$

If M is homogeneous, the lower bounds in both (7.5) and (7.6) may be increased by the factor
deg(M).

Proof: Both (7.5) and (7.6) follow from dividing (7.1) by $\theta_{2N}$ and using the estimate
$\mathrm{vol}_{2d}(M[1-\epsilon]) \ge \theta_{2d}(1-\epsilon)^{2d}$ from Lemma 7.2. For homogeneous M use Lemma 7.1
instead. □

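The proof of Lemma 7.6 depended on the complex analyticity of M only in the estimate
$\mathrm{vol}_{2d}(M[1-\epsilon]\cap K(B(\epsilon))) \le \deg(M)\cdot s\cdot\theta_{2d}\cdot\epsilon^{2d}$, based on Lemma 7.3 and Corollary 7.5. If M
is a d-dimensional real variety in $\mathbb{R}^N$ we can use Lemma 7.4 to instead estimate

$$\mathrm{vol}_d(M[1-\epsilon]\cap K(B(\epsilon))) \le \deg(M)\cdot\frac{O_{N-d}\,O_d}{2\,O_N}\cdot\theta_d\cdot\epsilon^d.$$

Using this bound in the proof of Lemma 7.6 yields

Lemma 7.8: Let M be a purely d-dimensional real variety in $\mathbb{R}^N$. Then

$$\mathrm{vol}_N(T(M,\epsilon)) \ge \frac{\mathrm{vol}_d(M[1-\epsilon])}{\deg(M)\cdot\theta_d}\cdot\frac{2\,O_N}{O_{N-d}\,O_d}\cdot\theta_N\cdot\epsilon^{N-d} \tag{7.7}$$

This immediately yields

Theorem 7.9: Let M be a purely d-dimensional real variety in $\mathbb{R}^N$. Then

$$\frac{\mathrm{vol}_d(M[1-\epsilon])}{\deg(M)\cdot\theta_d}\cdot\frac{2\,O_N}{O_{N-d}\,O_d}\cdot\epsilon^{N-d} \le f(M,\epsilon) \tag{7.8}$$

If M is homogeneous then

$$\frac{(1-\epsilon)^d\,\mathrm{vol}_d(M[1])}{\deg(M)\cdot\theta_d}\cdot\frac{2\,O_N}{O_{N-d}\,O_d}\cdot\epsilon^{N-d} \le f(M,\epsilon) \tag{7.9}$$

Proof: (7.8) follows by dividing (7.7) by $\theta_N$. When M is homogeneous,
$\mathrm{vol}_d(M[r]) = r^d\,\mathrm{vol}_d(M[1])$, which gives (7.9). □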

Remark: This theorem has the unhappy property that when M is concentrated near the boundary
of B(1), the lower bound degenerates to 0 since $\mathrm{vol}_d(M[1-\epsilon]) = 0$. Consideration of
several examples of this kind leads us to conjecture that $\mathrm{vol}_d(M[1-\epsilon])$ may be replaced by

\o\d(M 111) r; L rr  ^  r, 

K     l  ^(l  +  €)A-(l-e)A  2A 

where $\epsilon \le 1$.

This completes all our lower bounds. The lower bounds in Theorems 4.1 and 4.4 follow
from substituting for $O_n$ and $\theta_n$ in Theorems 7.7 and 7.9.

We now turn to upper bounds. The following lemma generalizes a technique used in
[23, Prop 5.4]:

Lemma 7.10: Let M be a purely 2d-dimensional complex variety in $\mathbb{C}^N$. Let

$$s = \begin{cases} 1 & \text{if } M \text{ is homogeneous} \\ N & \text{if } M \text{ is a nonhomogeneous hypersurface } (d = N-1) \\ O_{2N-2d}\,O_{2d}/(2\,O_{2N}) & \text{otherwise} \end{cases} \tag{7.10}$$

Then

$$\mathrm{vol}_{2N}(T(M,\epsilon)) \le e^2 N^2 (N-1)^{2(N-d-1)}\cdot s\cdot\deg(M)\cdot\theta_{2N}\cdot(1+N\epsilon)^{2d}\cdot\epsilon^{2(N-d)} \tag{7.11}$$

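Proof: Let dK be the kinematic density for the group of Euclidean motions as in the proof of
Lemma 7.6. Then from [25, eq. 15.20] we have

$$\int \mathrm{vol}_{2d}(M[1+N\epsilon]\cap K(B(N\epsilon)))\,dK = O_{2N-1}\cdots O_1\cdot\mathrm{vol}_{2N}(B(N\epsilon))\cdot\mathrm{vol}_{2d}(M[1+N\epsilon]) \tag{7.12}$$

where the integral is over all motions K such that $M[1+N\epsilon]\cap K(B(N\epsilon)) \ne \emptyset$. Since
$\mathrm{vol}_{2N}(B(N\epsilon)) = \theta_{2N}\cdot(N\epsilon)^{2N}$ and $\mathrm{vol}_{2d}(M[1+N\epsilon]) \le \deg(M)\cdot\theta_{2d}\cdot(1+N\epsilon)^{2d}\cdot s$ by Lemmas 7.1, 7.3
and 7.4, we have

$$\int \mathrm{vol}_{2d}(M[1+N\epsilon]\cap K(B(N\epsilon)))\,dK \le O_{2N-1}\cdots O_1\cdot\theta_{2N}N^{2N}\epsilon^{2N}\cdot\deg(M)\cdot\theta_{2d}\cdot(1+N\epsilon)^{2d}\cdot s \tag{7.13}$$

Recall that K(s) = As + b, where $A \in SO(2N)$, and $dK = dA \wedge db$. Thus $K(B(N\epsilon)) = B(b,N\epsilon)$.
Then if $\mathrm{dist}(b,M[1]) \le \epsilon$, not only do $M[1+N\epsilon]$ and $K(B(N\epsilon))$ intersect, but their intersection
contains $M[1+N\epsilon]\cap B(p,(N-1)\epsilon)$ for some $p \in M[1]$ with $\|b-p\| \le \epsilon$. If in addition $\|b\| \le 1$,
then $B(p,(N-1)\epsilon) \subset B(1+N\epsilon)$ and so by Lemma 7.2

$$\mathrm{vol}_{2d}(M[1+N\epsilon]\cap K(B(N\epsilon))) \ge \theta_{2d}\cdot((N-1)\epsilon)^{2d},$$

and so

$$\int \mathrm{vol}_{2d}(M[1+N\epsilon]\cap K(B(N\epsilon)))\,dK
\ge \int_{\substack{\mathrm{dist}(b,M[1])\le\epsilon \\ \|b\|\le 1}} \theta_{2d}((N-1)\epsilon)^{2d}\,dK
= \theta_{2d}((N-1)\epsilon)^{2d}\int_{\substack{\mathrm{dist}(b,M[1])\le\epsilon \\ \|b\|\le 1}} dK
= O_{2N-1}\cdots O_1\cdot\theta_{2d}((N-1)\epsilon)^{2d}\cdot\mathrm{vol}_{2N}(T(M,\epsilon)) \tag{7.14}$$

where we have used [25, eq. 12.11].
Combining (7.13) and (7.14) yields

$$\mathrm{vol}_{2N}(T(M,\epsilon)) \le \frac{N^{2N}}{(N-1)^{2d}}\cdot s\cdot\deg(M)\cdot\theta_{2N}\cdot(1+N\epsilon)^{2d}\cdot\epsilon^{2(N-d)}$$

$$= \left(\frac{N}{N-1}\right)^{2N-2}\cdot N^2\cdot(N-1)^{2(N-d-1)}\cdot s\cdot\deg(M)\cdot\theta_{2N}\cdot(1+N\epsilon)^{2d}\,\epsilon^{2(N-d)}$$

$$\le e^2\cdot N^2\cdot(N-1)^{2(N-d-1)}\cdot s\cdot\deg(M)\cdot\theta_{2N}\cdot(1+N\epsilon)^{2d}\,\epsilon^{2(N-d)}$$

as desired. □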

Dividing by $\theta_{2N}$ immediately yields

Theorem 7.11: Let M be a purely 2d-dimensional complex variety in $\mathbb{C}^N$. Let s be defined as
in Lemma 7.10. Then

$$f(M,\epsilon) \le e^2 N^2 (N-1)^{2(N-d-1)}\cdot s\cdot\deg(M)\cdot(1+N\epsilon)^{2d}\,\epsilon^{2(N-d)}$$

The upper bounds in Theorem 4.1 follow immediately from Theorem 7.11. This completes
our proofs of upper bounds.
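To get a feel for how far apart the lower and upper bounds are, the hypersurface case (d = N-1) of (7.6) and Theorem 7.11 can be evaluated directly. The sketch below is illustrative only: the function names and the sample values of N, deg(M) and ε are assumptions, and it also checks the elementary inequality $(N/(N-1))^{2N-2} \le e^2$ used at the end of the proof of Lemma 7.10.

```python
import math


def lower_bound_hypersurface(N, deg, eps, homogeneous=False):
    """(7.6): (1 - eps)^(2N-2) eps^2 / (N deg), times deg when M is homogeneous."""
    bound = (1 - eps) ** (2 * N - 2) * eps ** 2 / (N * deg)
    return bound * deg if homogeneous else bound


def upper_bound_hypersurface(N, deg, eps, homogeneous=False):
    """Theorem 7.11 with d = N - 1, where the factor (N-1)^(2(N-d-1)) equals 1."""
    s = 1 if homogeneous else N          # s as defined in (7.10)
    return math.e ** 2 * N ** 2 * s * deg * (1 + N * eps) ** (2 * N - 2) * eps ** 2


if __name__ == "__main__":
    N, deg = 4, 3                        # hypothetical sample values
    for k in range(2, 7):
        eps = 10.0 ** -k
        print(eps,
              lower_bound_hypersurface(N, deg, eps),
              upper_bound_hypersurface(N, deg, eps))
    # the inequality behind the constant e^2 in (7.11)
    print(max((n / (n - 1)) ** (2 * n - 2) for n in range(2, 1000)), math.e ** 2)
```

Both bounds scale like $\epsilon^2$ for small ε, so their ratio tends to a constant that depends only on N and deg(M).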

Note that for complex homogeneous varieties the upper bound in Theorem 7.11 is proportional
to deg(M) whereas the lower bound in Theorem 7.7 is independent of deg(M). This
gap is unavoidable as the following example shows. Suppose M consists of deg(M) hyperplanes
through the origin and tilted just slightly away from one another. For any $\epsilon_0 > 0$ the
planes may be chosen so close together that the set of points within $\epsilon_0$ of any one overlaps
almost completely with the set of points within $\epsilon_0$ of any other. Therefore the volume of
$T(M,\epsilon)$ will be insignificantly larger than for one hyperplane for all $\epsilon \ge \epsilon_0$. Thus, no lower
bound can be proportional to deg(M). In this example M is reducible; since irreducible homogeneous
polynomials in at least three variables are dense in the set of all homogeneous polynomials
in at least three variables and of the same degree, we see that restricting to irreducible
varieties would not improve the lower bound. For nonhomogeneous complex varieties
the lower bound in Theorem 7.7 is proportional to 1/deg(M); we conjecture that this factor
may be removed.
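The overlap effect in this example is easy to see in a low-dimensional real analogue. The sketch below is an assumption-laden illustration rather than the complex setting of the text: it takes M to be a union of lines through the origin in the plane, samples the unit disc by rejection with a fixed seed, and estimates the fraction of the disc within ε of M. The name `fraction_near_lines` and all parameter values are hypothetical.

```python
import math
import random


def fraction_near_lines(angles, eps, samples=20000, seed=0):
    """Monte Carlo estimate of the fraction of the unit disc within eps of a union of lines.

    Each line passes through the origin at one of the given angles (radians).
    """
    rng = random.Random(seed)
    hits = total = 0
    while total < samples:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y > 1:
            continue                     # rejection sampling: keep points in the disc
        total += 1
        # distance from (x, y) to the line at angle t is |x sin t - y cos t|
        if any(abs(x * math.sin(t) - y * math.cos(t)) <= eps for t in angles):
            hits += 1
    return hits / total


if __name__ == "__main__":
    eps = 0.1
    one = fraction_near_lines([0.0], eps)
    close = fraction_near_lines([0.0, 1e-3], eps)          # two nearly coincident lines
    apart = fraction_near_lines([0.0, math.pi / 2], eps)   # two perpendicular lines
    print(one, close, apart)
```

With the same seed every call sees the same sample points, so the comparison is direct: two nearly coincident lines cover barely more of the disc than one, while two well separated lines cover almost twice as much.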

We now turn to an asymptotic expression for $\mathrm{vol}_{2N}(T(M,\epsilon))$ when M is homogeneous.
The expression, not surprisingly, will simply be $\mathrm{vol}_{2d}(M[1])\cdot\theta_{2N-2d}\,\epsilon^{2(N-d)}$, just as if M[1]
were a "rectangle" and $T(M,\epsilon)$ a "rectangular parallelepiped" of radius $\epsilon$.

To prove this, we will need some notation. If M is a smooth d-manifold in $\mathbb{R}^N$ let

$$T'(M,\epsilon) = \bigcup_{p\in M} N(p,\epsilon)$$

where $N(p,\epsilon)$ is a closed (N-d)-ball of radius $\epsilon$ centered at $p \in M$ and orthogonal to M. We
will call $T'(M,\epsilon)$ a tubular neighborhood if each $y \in T'(M,\epsilon)$ lies in exactly one $N(p,\epsilon)$. If,
for example, M is a hypersurface, $T'(M,\epsilon)$ will be a tubular neighborhood if $\epsilon$ is smaller than
the radius of curvature at all points of M.

Our main tool is due to Weyl:

Theorem 7.12: [32]. If M is a closed d-dimensional manifold in $\mathbb{R}^N$ and $T'(M,\epsilon)$ is a tubular
neighborhood of M, then

$$\mathrm{vol}_N(T'(M,\epsilon)) = \sum_{k=N-d}^{N} c_{k,N}(M)\,\epsilon^k \tag{7.15}$$

where

$$c_{N-d,N}(M) = \mathrm{vol}_d(M)\cdot\theta_{N-d}.$$
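As a small worked instance of the leading term, take M to be a circle of radius R in the plane (N = 2, d = 1), with $\epsilon < R$ so that $T'(M,\epsilon)$ is a tubular neighborhood. This example is an illustration chosen here, not one taken from the paper; it uses $\theta_1 = 2$, the length of the interval [-1, 1].

```latex
\begin{align*}
\mathrm{vol}_2(T'(M,\epsilon)) &= \pi (R+\epsilon)^2 - \pi (R-\epsilon)^2 = 4\pi R\,\epsilon, \\
c_{1,2}(M)\,\epsilon &= \mathrm{vol}_1(M)\cdot\theta_1\cdot\epsilon = (2\pi R)\cdot 2\cdot\epsilon = 4\pi R\,\epsilon .
\end{align*}
```

Here the annulus area is matched exactly by the first term of the sum, so the remaining coefficient $c_{2,2}(M)$ vanishes for the circle.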

The reason we cannot apply Weyl's theorem directly to varieties (which would have
saved a great deal of effort!) is that varieties may have singularities, so that $T'(M,\epsilon)$ may
never be a tubular neighborhood for any $\epsilon$. If we eliminate a small neighborhood of the
singular part of a variety, we may apply Weyl's theorem to the remainder. If we can show
that the contribution to $\mathrm{vol}_N(T(M,\epsilon))$ of this neighborhood of the singular part goes to 0 as
the radius of the neighborhood shrinks to 0, then the first term of Weyl's theorem will provide
the correct asymptotic behavior of $\mathrm{vol}_N(T(M,\epsilon))$ for small $\epsilon$.

The proof goes as follows. Let M be a homogeneous purely 2d-dimensional complex
variety in $\mathbb{C}^N$. Let $S(M,r) = \{x \in M : \mathrm{dist}(x,S(M)) < r\}$ be an r-neighborhood in M of the singular
part of M. As usual $S(M,r)[s]$ will denote $S(M,r)\cap B(s)$. Note that
$\lim_{r\to 0} \mathrm{vol}_{2d}(S(M,r)[s]) = 0$. Let $M_{in}(r) = S(M,r)[1]$ and $M_{out}(r) = \{x \in M[1] : x \notin M_{in}(r)\}$. Note
that $\lim_{r\to 0} \mathrm{vol}_{2d}(M_{out}(r)) = \mathrm{vol}_{2d}(M[1])$.

Now for all r we may apply Weyl's theorem to $M_{out}(r)$: there is some $\epsilon(r) > 0$ such that
when $\epsilon < \epsilon(r)$, $T'(M_{out}(r),\epsilon)$ is a tubular neighborhood. We estimate $\mathrm{vol}_{2N}(T(M,\epsilon))$ as follows:

$$\mathrm{vol}_{2N}(T'(M_{out}(r)[1-\epsilon],\epsilon)) \le \mathrm{vol}_{2N}(T(M,\epsilon))
\le \mathrm{vol}_{2N}(T'(M_{out}(r),\epsilon)) + \mathrm{vol}_{2N}(T(M_{in}(r),\epsilon)) \tag{7.16}$$

The bounds on $\mathrm{vol}_{2N}(T(M,\epsilon))$ hold because $T'(M_{out}(r)[1-\epsilon],\epsilon) \subset T(M,\epsilon)$ and
$T(M,\epsilon) \subset T'(M_{out}(r),\epsilon)\cup T(M_{in}(r),\epsilon)$. For fixed r we have by Weyl's theorem

™  vol2A,(r'(Mout(r)[l-6],€))    "    ™  vol2^(A/out(r))+0(€) 

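It remains to prove that $\mathrm{vol}_{2N}(T(M_{in}(r),\epsilon))$ is negligible compared to $\mathrm{vol}_{2N}(T'(M_{out}(r),\epsilon))$
for small enough r. More precisely, we need to show that for all $\eta > 0$ there is an $\bar r > 0$ such
that for all $r < \bar r$ and $\epsilon < \epsilon(r)$

$$\frac{\mathrm{vol}_{2N}(T(M_{in}(r),\epsilon))}{\mathrm{vol}_{2N}(T'(M_{out}(r),\epsilon))} \le \eta \tag{7.17}$$

We will prove (7.17) by showing that for $N\epsilon < r < 1$,

$$\mathrm{vol}_{2N}(T(M_{in}(r),\epsilon)) \le e^2 N^2 (N-1)^{2(N-d-1)}\cdot\frac{\theta_{2N}}{\theta_{2d}}\cdot\mathrm{vol}_{2d}(S(M,2r)[2])\cdot\epsilon^{2(N-d)}. \tag{7.18}$$

Since for small r and $\epsilon < \epsilon(r)$, $\mathrm{vol}_{2N}(T'(M_{out}(r),\epsilon))$ behaves like $c\cdot\epsilon^{2(N-d)}$, where c is a constant
near $\theta_{2(N-d)}\cdot\mathrm{vol}_{2d}(M[1])$, the expression in (7.17) is bounded above by a constant times
$\mathrm{vol}_{2d}(S(M,2r)[2])$, which goes to 0 as r goes to 0.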

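We will prove (7.18) using the same technique as in Lemma 7.10. Let $dK = dA \wedge db$ be
the kinematic density for the group of Euclidean motions. Then [25, eq. 15.20] implies

$$\int \mathrm{vol}_{2d}(S(M,2r)[1+N\epsilon]\cap K(B(N\epsilon)))\,dK
= O_{2N-1}\cdots O_1\cdot\mathrm{vol}_{2N}(B(N\epsilon))\cdot\mathrm{vol}_{2d}(S(M,2r)[1+N\epsilon]) \tag{7.19}$$

Note that $\mathrm{vol}_{2N}(B(N\epsilon)) = \theta_{2N}(N\epsilon)^{2N}$ and, since $N\epsilon < 1$,
$\mathrm{vol}_{2d}(S(M,2r)[1+N\epsilon]) \le \mathrm{vol}_{2d}(S(M,2r)[2])$. Now if $\mathrm{dist}(b,M_{in}(r)) \le \epsilon$ and $\|b\| \le 1$, then
there exists $p \in M_{in}(r) \subset S(M,2r)[1+N\epsilon]$ such that $\|b-p\| \le \epsilon$. Furthermore, since $K(B(N\epsilon)) = B(b,N\epsilon)$,

$$S(M,2r)[1+N\epsilon]\cap B(p,(N-1)\epsilon) \subset S(M,2r)[1+N\epsilon]\cap B(b,N\epsilon)$$

and so by Lemma 7.2 and the fact that $N\epsilon < r$,

$$\mathrm{vol}_{2d}(S(M,2r)[1+N\epsilon]\cap K(B(N\epsilon))) \ge \theta_{2d}\cdot((N-1)\epsilon)^{2d}.$$

From (7.19) we have therefore that

$$\theta_{2d}\cdot((N-1)\epsilon)^{2d}\int_{\substack{\mathrm{dist}(b,M_{in}(r))\le\epsilon \\ \|b\|\le 1}} dA\wedge db \le O_{2N-1}\cdots O_1\cdot\theta_{2N}(N\epsilon)^{2N}\cdot\mathrm{vol}_{2d}(S(M,2r)[2])$$

or

$$\theta_{2d}\cdot((N-1)\epsilon)^{2d}\cdot\mathrm{vol}_{2N}(T(M_{in}(r),\epsilon)) \le \theta_{2N}(N\epsilon)^{2N}\cdot\mathrm{vol}_{2d}(S(M,2r)[2])$$

yielding (7.18) as desired.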

In summary, we have proven

Theorem 7.13: If M is a purely 2d-dimensional complex variety in $\mathbb{C}^N$, then for asymptotically
small $\epsilon$ Weyl's theorem correctly estimates $\mathrm{vol}_{2N}(T(M,\epsilon))$. In particular, when M is
homogeneous

$$\mathrm{vol}_{2N}(T(M,\epsilon)) = \deg(M)\cdot\theta_{2d}\,\theta_{2(N-d)}\cdot\epsilon^{2(N-d)} + o(\epsilon^{2(N-d)}).$$

Proof: From Lemma 7.1 we have $\mathrm{vol}_{2d}(M[1]) = \deg(M)\cdot\theta_{2d}$. □

Corollary 7.14: If M is a homogeneous, purely 2d-dimensional complex variety in $\mathbb{C}^N$, then
asymptotically

$$f(M,\epsilon) = \binom{N}{d}\cdot\deg(M)\cdot\epsilon^{2(N-d)} + o(\epsilon^{2(N-d)})$$

This completes the proof of Theorem 4.1.
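Passing from Theorem 7.13 to Corollary 7.14 amounts to dividing by $\theta_{2N}$, and the resulting constant $\theta_{2d}\theta_{2(N-d)}/\theta_{2N}$ is a binomial coefficient because $\theta_{2k} = \pi^k/k!$. The sketch below checks this numerically; it uses the equivalent form $\theta_n = \pi^{n/2}/\Gamma(n/2+1)$ so that n = 0 is allowed, and the function names are illustrative.

```python
import math


def ball_volume(n):
    """theta_n = pi^(n/2) / Gamma(n/2 + 1), the volume of the unit ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def leading_coefficient(N, d):
    """theta_(2d) theta_(2(N-d)) / theta_(2N), the constant multiplying deg(M) eps^(2(N-d))."""
    return ball_volume(2 * d) * ball_volume(2 * (N - d)) / ball_volume(2 * N)


if __name__ == "__main__":
    for N in range(1, 7):
        print(N, [round(leading_coefficient(N, d), 6) for d in range(N + 1)])
```

Each printed row reproduces a row of Pascal's triangle up to rounding.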

It appears that Ocneanu's proof of Theorem 4.3 may be modified so as to prove Conjecture
4.15 in a way analogous to Theorem 7.13: the contribution to the volume of $T(M,\epsilon)$
from points near the singular part of M is asymptotically negligible compared to the points
away from the singular part, to which we can apply Weyl's theorem.

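Now we turn to using these bounds to estimate the probability distribution
$\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon)$ when M is homogeneous and p is uniformly distributed on the unit
sphere. We will use the following geometric interpretation of this probability: let
$T_S(M,\epsilon) = \{p : \|p\| = 1,\ \mathrm{dist}(p,M) \le \epsilon\}$. Then if the ambient space is $\mathbb{R}^N$ we have by definition

$$\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon) = \frac{\mathrm{vol}_{N-1}(T_S(M,\epsilon))}{O_{N-1}}$$

Now define $T_C(M,\epsilon) = \{p : 0 < \|p\| \le 1,\ \mathrm{dist}(p/\|p\|,M) \le \epsilon\} \cup \{0\}$. Then $T_C(M,\epsilon)$ is the intersection
of a homogeneous set with B(1), and its intersection with the unit sphere is $T_S(M,\epsilon)$.
Clearly

$$\mathrm{vol}_N(T_C(M,\epsilon)) = \int_0^1 \mathrm{vol}_{N-1}(T_S(M,\epsilon))\,r^{N-1}\,dr = \frac{\mathrm{vol}_{N-1}(T_S(M,\epsilon))}{N}$$

so, since $\theta_N = O_{N-1}/N$,

$$\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon) = \frac{\mathrm{vol}_N(T_C(M,\epsilon))}{\theta_N}$$

i.e. the fraction of the volume of the unit ball occupied by $T_C(M,\epsilon)$. Since $T_C(M,\epsilon) \subset T(M,\epsilon)$,

$$\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon) \le \frac{\mathrm{vol}_N(T(M,\epsilon))}{\theta_N} = f(M,\epsilon)$$

and any upper bound for $f(M,\epsilon)$ is an upper bound for $\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon)$; this proves the
upper bounds in Theorems 4.2 and 4.5.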

Now we turn to asymptotic expressions. Suppose M is homogeneous, d-dimensional and
embedded in $\mathbb{R}^N$. From our previous discussion, it suffices to consider tubular neighborhoods
and ignore singularities. In fact, we can do a purely local analysis by writing both $T(M,\epsilon)$
and $T_C(M,\epsilon)$ as the union of disjoint sections orthogonal to M, and comparing the volumes of
the sections. In particular, for each $p \in M$, $\|p\| = 1$, let $N(p,\epsilon)$ be the section of $T(M,\epsilon)$ whose
"bases" are (N-d)-balls $B(\pm p,\epsilon)$ orthogonal to M and which consists of all line segments connecting
a point $p_1 \in B(p,\epsilon)$ to a point $p_2 \in B(-p,\epsilon)$. Let $N_C(p,\epsilon)$ be the section of $T_C(M,\epsilon)$
with the same bases and consisting of all line segments connecting $p_1 \in B(p,\epsilon)$ and $-p_1$. Thus

$$T(M,\epsilon) = \bigcup_{\substack{p\in M \\ \|p\|=1}} N(p,\epsilon) \quad\text{and}\quad T_C(M,\epsilon) = \bigcup_{\substack{p\in M \\ \|p\|=1}} N_C(p,\epsilon).$$

Let dp be a volume element of $M_S = \{p \in M,\ \|p\| = 1\}$, and $N(dp,\epsilon)$ and $N_C(dp,\epsilon)$ be the union of the
corresponding sections. Then if S(r) is the sphere of radius r

$$\mathrm{vol}_N(N(dp,\epsilon)) = 2\int_0^1 \mathrm{vol}_{N-1}(N(dp,\epsilon)\cap S(r))\,dr = 2\int_0^1 \mathrm{vol}_{N-1}(N(dp,\epsilon)\cap S(1))\,r^{d-1}\,dr
= 2\,\mathrm{vol}_{N-1}(N(dp,\epsilon)\cap S(1))/d$$

and

$$\mathrm{vol}_N(N_C(dp,\epsilon)) = 2\int_0^1 \mathrm{vol}_{N-1}(N_C(dp,\epsilon)\cap S(r))\,dr = 2\int_0^1 \mathrm{vol}_{N-1}(N_C(dp,\epsilon)\cap S(1))\,r^{N-1}\,dr
= 2\,\mathrm{vol}_{N-1}(N(dp,\epsilon)\cap S(1))/N,$$

i.e. the volume of the section of $T(M,\epsilon)$ is N/d times the volume of the section of $T_C(M,\epsilon)$.
Therefore

$$\lim_{\epsilon\to 0}\frac{f(M,\epsilon)}{\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon)} = \lim_{\epsilon\to 0}\frac{\mathrm{vol}_N(T(M,\epsilon))}{\mathrm{vol}_N(T_C(M,\epsilon))} = \frac{N}{d}$$

which proves the asymptotic results in Theorem 4.2 and (4.19) (the latter result depends on
the truth of conjecture (4.15)).
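The limiting ratio N/d can be checked by hand in the simplest real case. The sketch below assumes M is the x-axis in the plane (N = 2, d = 1), for which both quantities have closed forms: the fraction of the unit disc with $|y| \le \epsilon$, and the fraction of the unit circle with $|\sin\phi| \le \epsilon$. The function names are illustrative.

```python
import math


def f_line(eps):
    """f(M, eps) for M the x-axis in R^2: fraction of the unit disc with |y| <= eps."""
    return (2 / math.pi) * (eps * math.sqrt(1 - eps * eps) + math.asin(eps))


def prob_line(eps):
    """Prob(dist(p, M) <= eps) for p uniformly distributed on the unit circle."""
    return (2 / math.pi) * math.asin(eps)


if __name__ == "__main__":
    for k in range(1, 6):
        eps = 10.0 ** -k
        # the ratio should approach N / d = 2 from below as eps shrinks
        print(eps, f_line(eps) / prob_line(eps))
```

In this example the ratio never exceeds 2, so $f(M,\epsilon)/N$ stays below the probability for every ε in (0, 1], while the probability itself stays below $f(M,\epsilon)$.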

Finally, we consider lower bounds on $\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon)$. We will show

$$\mathrm{vol}_N(T(M,\epsilon)) \le \mathrm{vol}_{N-1}(T_S(M,\epsilon)) \tag{7.20}$$

which will immediately imply

$$f(M,\epsilon) = \frac{\mathrm{vol}_N(T(M,\epsilon))}{\theta_N} \le \frac{\mathrm{vol}_{N-1}(T_S(M,\epsilon))}{\theta_N} = N\cdot\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon). \tag{7.21}$$

To prove (7.20), consider the volume element dx of $T_S(M,\epsilon)$. Let $p \in M$ be a point closest to
x, and construct the cylinder with base dx, axis parallel to the segment Op and of length 1.
Since the length of the cylinder is 1 and its cross section is at most dx, its volume is at most
dx. The union of all these cylinders fills up $T(M,\epsilon)$ (perhaps multiply), so integrating their
volumes over $T_S(M,\epsilon)$ yields (7.20). (7.21) shows that $f(M,\epsilon)/N$ is a lower bound on
$\mathrm{Prob}(\mathrm{dist}(p,M) \le \epsilon)$ and completes the proofs of Theorems 4.2 and 4.5.